ClickHouse cluster setup
ClickHouse service configuration
- ch_id must be unique; it identifies each ClickHouse service. Run the command once per service you need (a loop sketch follows the two invocations below)
ch_id=1;
helm template /media/app-audit/helm-charts/component/clickhouse-1.3.0.tgz --set-string cfg.namespace=app-audit --set-string cfg.prod_name=app-audit --set-string cfg.host_opt_base_dir=/opt/dsd/ --set-string cfg.dns_domain=qzprod --set-string cfg.cluster_regi_url=registry.common.svc.qzprod:8092/ --set cfg.enabled.secret=true --set-string resources.limits.cpu=1000m --set-string resources.requests.cpu=100m --set-string cfg.clickhouse.max_memory_usage=2G --set cfg.name.svc=clickhouse-server-${ch_id} --set cfg.name.hl_svc=clickhouse-headless-service-${ch_id} --set cfg.name.deploy=clickhouse-app-${ch_id} --set cfg.name.pod=clickhouse-app-${ch_id} --set cfg.name.sn=tabix-secret-${ch_id} --set cfg.name.ing=clickhouse-ingress-${ch_id} --set cfg.name.cm=clickhouse-conf-${ch_id} --set cfg.name.initdb=init-clickhouse-db-${ch_id} --set cfg.name.metrica=clickhouse-metrica-${ch_id} --set nodeSelector.clickhouse=${ch_id} --set cfg.labels.pod.ins=clickhouse-${ch_id} --set cfg.labels.svc.app=clickhouse-${ch_id} --set cfg.labels.deploy.app=clickhouse-${ch_id} --set cfg.labels.hl_svc.app=clickhouse-${ch_id} | akapply -f -
ch_id=2;
helm template /media/app-audit/helm-charts/component/clickhouse-1.3.0.tgz --set-string cfg.namespace=app-audit --set-string cfg.prod_name=app-audit --set-string cfg.host_opt_base_dir=/opt/dsd/ --set-string cfg.dns_domain=qzprod --set-string cfg.cluster_regi_url=registry.common.svc.qzprod:8092/ --set cfg.enabled.secret=true --set-string resources.limits.cpu=1000m --set-string resources.requests.cpu=100m --set-string cfg.clickhouse.max_memory_usage=2G --set cfg.name.svc=clickhouse-server-${ch_id} --set cfg.name.hl_svc=clickhouse-headless-service-${ch_id} --set cfg.name.deploy=clickhouse-app-${ch_id} --set cfg.name.pod=clickhouse-app-${ch_id} --set cfg.name.sn=tabix-secret-${ch_id} --set cfg.name.ing=clickhouse-ingress-${ch_id} --set cfg.name.cm=clickhouse-conf-${ch_id} --set cfg.name.initdb=init-clickhouse-db-${ch_id} --set cfg.name.metrica=clickhouse-metrica-${ch_id} --set nodeSelector.clickhouse=${ch_id} --set cfg.labels.pod.ins=clickhouse-${ch_id} --set cfg.labels.svc.app=clickhouse-${ch_id} --set cfg.labels.deploy.app=clickhouse-${ch_id} --set cfg.labels.hl_svc.app=clickhouse-${ch_id} | akapply -f -
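Only ch_id differs between the two invocations, so they can be collapsed into a loop; a bash sketch equivalent to the commands above:
# deploy one ClickHouse instance per ch_id; flags identical to the manual runs
for ch_id in 1 2; do
  helm template /media/app-audit/helm-charts/component/clickhouse-1.3.0.tgz \
    --set-string cfg.namespace=app-audit \
    --set-string cfg.prod_name=app-audit \
    --set-string cfg.host_opt_base_dir=/opt/dsd/ \
    --set-string cfg.dns_domain=qzprod \
    --set-string cfg.cluster_regi_url=registry.common.svc.qzprod:8092/ \
    --set cfg.enabled.secret=true \
    --set-string resources.limits.cpu=1000m \
    --set-string resources.requests.cpu=100m \
    --set-string cfg.clickhouse.max_memory_usage=2G \
    --set cfg.name.svc=clickhouse-server-${ch_id} \
    --set cfg.name.hl_svc=clickhouse-headless-service-${ch_id} \
    --set cfg.name.deploy=clickhouse-app-${ch_id} \
    --set cfg.name.pod=clickhouse-app-${ch_id} \
    --set cfg.name.sn=tabix-secret-${ch_id} \
    --set cfg.name.ing=clickhouse-ingress-${ch_id} \
    --set cfg.name.cm=clickhouse-conf-${ch_id} \
    --set cfg.name.initdb=init-clickhouse-db-${ch_id} \
    --set cfg.name.metrica=clickhouse-metrica-${ch_id} \
    --set nodeSelector.clickhouse=${ch_id} \
    --set cfg.labels.pod.ins=clickhouse-${ch_id} \
    --set cfg.labels.svc.app=clickhouse-${ch_id} \
    --set cfg.labels.deploy.app=clickhouse-${ch_id} \
    --set cfg.labels.hl_svc.app=clickhouse-${ch_id} \
    | akapply -f -
done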
- Check the clickhouse-server services
akget svc | grep clickhouse
Label the nodes
# Label the nodes to match the number of ClickHouse deployments
# clickhouse-app-1 uses node selector clickhouse=1, clickhouse-app-2 uses clickhouse=2
kubectl label node <node-1> clickhouse=1 --overwrite
kubectl label node <node-2> clickhouse=2 --overwrite
...
- Verify that each pod landed on the node with the matching label
akget po -o wide | grep clickhouse
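To double-check the labels themselves, plain kubectl can list the clickhouse label value per node:
kubectl get nodes -L clickhouse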
Generate the shared cluster configuration
akget cm clickhouse-conf-1 -o yaml > /tmp/clickhouse-cluster-conf.yaml
- Edit the YAML
vi /tmp/clickhouse-cluster-conf.yaml
apiVersion: v1
data:
config.xml: "<?xml version=\"1.0\"?>\n<yandex>\n <path>/var/lib/clickhouse/data/</path>\n
\ <tmp_path>/var/lib/clickhouse/data/tmp/</tmp_path>\n <user_files_path>/var/lib/clickhouse/data/user_files/</user_files_path>\n
\ <format_schema_path>/var/lib/clickhouse/data/format_schemas/</format_schema_path>\n
\ \n <include_from>/etc/clickhouse-server/metrica.d/metrica.xml</include_from>\n\n
\ <users_config>users.xml</users_config>\n \n <display_name>clickhouse</display_name>\n
\ <listen_host>0.0.0.0</listen_host>\n <http_port>8123</http_port>\n <tcp_port>9000</tcp_port>\n
\ <interserver_http_port>9009</interserver_http_port>\n <max_connections>4096</max_connections>\n
\ <keep_alive_timeout>28800</keep_alive_timeout>\n <max_concurrent_queries>4096</max_concurrent_queries>\n
\ <uncompressed_cache_size>8589934592</uncompressed_cache_size>\n <mark_cache_size>5368709120</mark_cache_size>\n
\ <timezone>Asia/Shanghai</timezone>\n <umask>022</umask>\n <mlock_executable>false</mlock_executable>\n
\ <remote_servers incl=\"clickhouse_remote_servers\" optional=\"true\" />\n
\ <zookeeper incl=\"zookeeper-servers\" optional=\"true\" />\n <macros incl=\"macros\"
optional=\"true\" />\n <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>\n
\ <max_session_timeout>36000</max_session_timeout>\n <default_session_timeout>6000</default_session_timeout>\n
\ <disable_internal_dns_cache>1</disable_internal_dns_cache>\n <merge_tree>\n
\ <parts_to_delay_insert>300</parts_to_delay_insert>\n <parts_to_throw_insert>900</parts_to_throw_insert>\n
\ </merge_tree>\n <query_log>\n <database>system</database>\n <table>query_log</table>\n
\ <partition_by>toYYYYMM(event_date)</partition_by>\n <flush_interval_milliseconds>7500</flush_interval_milliseconds>\n
\ </query_log>\n\n <query_thread_log>\n <database>system</database>\n
\ <table>query_thread_log</table>\n <partition_by>toYYYYMM(event_date)</partition_by>\n
\ <flush_interval_milliseconds>7500</flush_interval_milliseconds>\n </query_thread_log>\n
\ <max_table_size_to_drop>0</max_table_size_to_drop>\n <max_partition_size_to_drop>0</max_partition_size_to_drop>\n
\ <distributed_ddl>\n <path>/clickhouse/task_queue/ddl</path>\n </distributed_ddl>\n
\ <logger>\n <level>warning</level>\n <log>/var/lib/clickhouse/log//clickhouse-server.log</log>\n
\ <errorlog>/var/lib/clickhouse/log//clickhouse-server.err.log</errorlog>\n
\ <size>500M</size>\n <count>10</count>\n </logger>\n <max_open_files>262144</max_open_files>\n</yandex>"
remote_servers.xml: |-
<yandex>
<include_from>/etc/clickhouse-server/metrica.d/metrica.xml</include_from>
<remote_servers incl="clickhouse_remote_servers">
<clicks_cluster> <!-- custom cluster name -->
<shard> <!-- one <shard> block per shard -->
<internal_replication>false</internal_replication> <!-- default false: the Distributed engine writes each insert to every replica itself; set it true when the underlying tables are Replicated* so data is written to one replica and replicated internally -->
<replica> <!-- a shard may hold several replicas, each wrapped in its own <replica> block -->
<host>clickhouse-app-1-0.clickhouse-headless-service-1.app-audit.svc.qzprod</host>
<user>audit</user> <!-- user -->
<password>f4b858bd47d88e99f0b7ad6adb4d2631ce28370929f1891179eb2186c7dd0a10</password> <!-- password -->
<port>9000</port> <!-- port -->
</replica>
</shard>
<shard>
<internal_replication>false</internal_replication>
<replica>
<host>clickhouse-app-2-0.clickhouse-headless-service-2.app-audit.svc.qzprod</host>
<user>audit</user>
<password>f4b858bd47d88e99f0b7ad6adb4d2631ce28370929f1891179eb2186c7dd0a10</password>
<port>9000</port>
</replica>
</shard>
</clicks_cluster>
</remote_servers>
<listen_host>0.0.0.0</listen_host>
<path>/var/lib/clickhouse/data/</path>
<tmp_path>/var/lib/clickhouse/data/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/data/user_files/</user_files_path>
<logger>
<level>debug</level>
<log>/var/lib/clickhouse/log/clickhouse-server.log</log>
<errorlog>/var/lib/clickhouse/log/clickhouse-server.err.log</errorlog>
<console>1</console>
</logger>
<compression incl="clickhouse_compression">
<case>
<method>lz4</method>
</case>
</compression>
<distributed_ddl>
<path>/clickhouse/task_queue/ddl</path>
</distributed_ddl>
<macros incl="macros">
</macros>
<zookeeper incl="zookeeper-servers">
<node index="1"> <!-- ZooKeeper node; add one <node> per server, each with a unique index -->
<host>zookeeper-server</host> <!-- host -->
<port>2182</port> <!-- port -->
</node>
</zookeeper>
</yandex>
kind: ConfigMap
metadata:
annotations:
chart_version: 1.3.0
git-branch: charts-clickhouse-1.3.0
git-commit_hash: 3266d024f6fb59bb4d6b5f11af0536808a5c985c
git-committer_date: "2021-10-26T11:01:11+08:00"
git-committer_email: 2037461051@qq.com
git-committer_name: shihuai
git-repository: git@gitee.com:QuanZhiKeJi/olive-charts.git
git-revision: 3266d024f6fb59bb4d6b5f11af0536808a5c985c
labels:
app: clickhouse-1
name: clickhouse-cluster-conf
namespace: app-audit
- Apply the configuration
akapply -f /tmp/clickhouse-cluster-conf.yaml
- List the ConfigMaps; the step succeeded if clickhouse-cluster-conf appears
akget cm
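For a targeted check:
akget cm | grep clickhouse-cluster-conf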
Update the macros in every clickhouse-metrica ConfigMap; each macros block must be unique, one per shard
akedit cm clickhouse-metrica-1 (ConfigMap name)
<macros>
<shard>01</shard>
<replica>clickhouse-app-1-0.clickhouse-headless-service-1.app-audit.svc.qzprod</replica>
</macros>
</yandex>
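For the second shard, clickhouse-metrica-2 would carry its own unique pair; a sketch following the hostname pattern used in remote_servers.xml:
<macros>
<shard>02</shard>
<replica>clickhouse-app-2-0.clickhouse-headless-service-2.app-audit.svc.qzprod</replica>
</macros>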
Update every StatefulSet (a hypothetical fragment of what typically changes follows the command below)
akedit sts clickhouse-app-1
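Which fields to change depends on the chart, but typically the pod's config volume must now reference clickhouse-cluster-conf instead of the per-instance ConfigMap. A hypothetical fragment (the volume name is assumed, not taken from the chart):
volumes:
- name: clickhouse-conf            # assumed volume name; match whatever the sts already defines
  configMap:
    name: clickhouse-cluster-conf  # previously clickhouse-conf-<ch_id>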
After all configuration changes are done, restart every service
akdelete pod clickhouse-app-1-0 (pod name)
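A sketch that restarts both instances by deleting their pods (the StatefulSets recreate them with the new configuration):
for i in 1 2; do akdelete pod clickhouse-app-${i}-0; done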
Verification
- Create the database
create database dbName on cluster clicks_cluster;
- Create the local table
CREATE TABLE dbName.test_local on cluster clicks_cluster (`id` String) ENGINE = ReplicatedMergeTree PARTITION BY id ORDER BY (id) SETTINGS index_granularity = 8192
- Create the distributed table
create table dbName.test_all on cluster clicks_cluster as dbName.test_local ENGINE = Distributed(clicks_cluster,dbName,test_local,rand())
- Insert data
insert into dbName.test_local values('1');
- Query the distributed table; a correct result shows the union of every local table's rows
select * from dbName.test_all
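A quick cross-check sketch: after inserting one row per instance into its local table, the local count reflects only the current node while the distributed count is the cluster-wide total.
-- run on any node
select count() from dbName.test_all;   -- total rows across all shards
select count() from dbName.test_local; -- rows on this node only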
Shards and replicas
- shard denotes a shard; replica denotes a replica
The following represents 1 shard, 0 replicas:
<shard>
<replica>...</replica>
</shard>
The following represents 2 shards, 1 replica each:
<shard>
<replica>...</replica>
<replica>...</replica>
</shard>
<shard>
<replica>...</replica>
<replica>...</replica>
</shard>
- The replica tag denotes one replica; a shard with N replicas contains N replica blocks (a combined fragment follows this list)
- internal_replication: when true, data written through the distributed table goes to a single replica and the shard's Replicated* tables replicate it internally; when false (the default), the Distributed engine writes a copy to every replica itself
- default_database: the default database for this replica
- host: the hostname/IP that resolves to the ClickHouse instance
- port: the TCP port
- user: the database user
- password: the database password
- zookeeper configuration
- host: IP of the server running ZooKeeper
- port: the client port
- macros configuration
- shard: the shard number
- replica: the replica identifier
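Putting the fields above together, a single replica entry looks like the sketch below (host and credentials reuse the example values from remote_servers.xml; default_database is optional):
<replica>
<host>clickhouse-app-1-0.clickhouse-headless-service-1.app-audit.svc.qzprod</host>
<port>9000</port>
<user>audit</user>
<password>f4b858bd47d88e99f0b7ad6adb4d2631ce28370929f1891179eb2186c7dd0a10</password>
<default_database>dbName</default_database> <!-- optional -->
</replica>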
Growing/shrinking the cluster configuration
Adding a shard/replica
- Deploy the new ClickHouse service (repeat the helm steps above with a fresh ch_id)
- Edit clickhouse-cluster-conf to add the new shard/replica blocks, update the matching clickhouse-metrica ConfigMap, and update the clickhouse-app StatefulSet
- Restart every service
- akexec -it clickhouse-app-1-0 (pod name) bash
- Run clickhouse-client to enter the ClickHouse client
- Run select * from system.clusters; the change has taken effect once the new shard/replica entries appear (see the query sketch below)
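The useful columns to eyeball, as a query sketch (all are standard system.clusters columns):
select cluster, shard_num, replica_num, host_name, port from system.clusters where cluster = 'clicks_cluster';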
Removing a shard/replica
- Delete the corresponding shard/replica blocks from clickhouse-cluster-conf
- Restart every service
- akexec -it clickhouse-app-1-0 (pod name) bash
- Run clickhouse-client to enter the ClickHouse client
- Run select * from system.clusters and verify the number of entries and each host_name are correct