ClickHouse operations (tables, columns, deleting data, etc.)


微不足道的张三 · Published 2022-04-09 01:36:07

ClickHouse table operations

1. Creating tables

-- Local (physical) table
-- The database name contains a hyphen, so it is quoted with backticks.
create table `test-clickhouse`.test_table_local [on cluster clu] (
    id Int8,
    user_name String,
    nick_name String
)
ENGINE = MergeTree
PARTITION BY id
ORDER BY (id)
SETTINGS storage_policy = 'test-clickhouse';
-- Distributed table
CREATE TABLE `test-clickhouse`.test_table [on cluster clu]
AS `test-clickhouse`.test_table_local
ENGINE = Distributed(clu, 'test-clickhouse', test_table_local, sipHash64(id));

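test-clickhouse: database name
test_table_local: name of the local (physical) table, which stores the data
[on cluster clu]: create the table on every node of the cluster (clu is the cluster name). The square brackets mark the clause as optional; if it is omitted, the table is created on the current node only
PARTITION BY: partition key
ORDER BY: sorting key
ENGINE: table engine, here MergeTree (merge-tree engine) and Distributed (distributed engine, used to create distributed tables)

Distributed parameters:
    cluster: cluster name,
    database: database name,
    target table: name of the underlying local table,
    [ sharding key ]: key used to distribute rows across shards (optional)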
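
As a rough illustration of how the two tables work together, the sketch below writes through the distributed table and then counts rows per server. It assumes both tables above already exist on the cluster clu; the inserted values are invented for the example, and the counts can lag slightly behind the insert.

```sql
-- Write through the distributed table; each row is routed to a shard by sipHash64(id).
-- The three rows below are made-up sample data.
INSERT INTO `test-clickhouse`.test_table (id, user_name, nick_name) VALUES
    (1, 'zhang', 'z'),
    (2, 'li', 'l'),
    (3, 'wang', 'w');

-- hostName() is evaluated on each shard, so this shows how the rows were spread out.
SELECT hostName() AS host, count() AS row_count
FROM `test-clickhouse`.test_table
GROUP BY host
ORDER BY host;
```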

2. Column operations

2.1 Adding a column

alter table table_name [on cluster clu_name] add column column_name column_type [comment 'message']

2.2 Dropping a column

alter table table_name [on cluster clu_name] drop column column_name

2.3 Renaming a column

alter table table_name [on cluster clu_name] rename column col_1 to col_2

2.4 Adding a column comment

alter table table_name [on cluster clu_name] comment column column_name 'message'
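
Applied to the test_table_local table from section 1, the statements in 2.1 to 2.4 might look like the sketch below. The column names age and user_age and the comment texts are invented for illustration; the SELECT reads system.columns to check the result before the column is dropped again.

```sql
-- 2.1: add a column with a comment (hypothetical column "age")
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    ADD COLUMN age UInt8 COMMENT 'age in years';

-- 2.3: rename it
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    RENAME COLUMN age TO user_age;

-- 2.4: replace the comment
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    COMMENT COLUMN user_age 'user age in years';

-- Check the current columns and comments on this node
SELECT name, type, comment
FROM system.columns
WHERE database = 'test-clickhouse' AND table = 'test_table_local';

-- 2.2: drop the column again
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    DROP COLUMN user_age;
```

A distributed table keeps its own column list, so if the new column should be visible through test_table, the same ALTER most likely has to be run against that table as well.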

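2.5 Explanation

table_name: name of the local (physical) table
[on cluster clu_name]: the cluster to run on; optional
column_name: name of the column to add, drop, or comment
col_1 / col_2: current and new column name for a rename
[comment 'message']: column comment; optional
'message': the comment text, a string literal in single quotes (ClickHouse reads double-quoted text as an identifier)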

3. Deleting data

3.1 Plain SQL

-- WHERE is required; the delete runs as an asynchronous mutation
alter table table_name [on cluster clu_name] delete where ......
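
Because the delete is a mutation that runs in the background, the statement returns before the rows are actually gone. A minimal sketch of issuing a delete and then checking its progress, assuming the test_table_local table from section 1 and an invented condition id = 3:

```sql
-- Delete rows matching a condition on every node of the cluster
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    DELETE WHERE id = 3;

-- Unfinished mutations for this table on the current node;
-- an empty result means they have all completed here.
SELECT mutation_id, command, create_time, is_done, latest_fail_reason
FROM system.mutations
WHERE database = 'test-clickhouse'
  AND table = 'test_table_local'
  AND is_done = 0;
```

system.mutations only describes the node the query runs on, so on a cluster the check has to be repeated per node.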

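3.2 Bulk deletion by partition

alter table table_name [on cluster clu_name] drop partition partition_value;

3.3 Explanation

table_name: name of the local (physical) table
[on cluster clu_name]: the cluster (clu_name is the cluster name); optional
partition_value: value of the partition key that identifies the partition to drop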

3.4 Example

table_name: user
clu_name: clu
partition column: id

id(PARTITION)	name	age
11	zhang	21
11	li	23
12	lu	30
13	wang	16

Run:

alter table user on cluster clu drop partition '11';
-- Drops the partition with id = 11 from the user table; the partitions for id = 12 and id = 13 remain
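
One way to confirm which partitions exist before and after the drop is to read system.parts. This is a sketch that assumes the user table lives in the current database and that the query runs on a node holding the local table:

```sql
-- List the active partitions of the user table with their row counts.
-- Run it before and after the DROP PARTITION to compare.
SELECT partition, sum(rows) AS row_count
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'user'
  AND active
GROUP BY partition
ORDER BY partition;
```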

4. Dropping a table

drop table table_name [on cluster clu_name];
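
For the pair of tables created in section 1, a cluster-wide cleanup could look like the sketch below. Dropping the distributed table first is a choice made here so that nothing keeps routing to a local table that no longer exists; IF EXISTS is added so the statements can be re-run safely.

```sql
-- Drop the distributed table first, then the local table behind it
DROP TABLE IF EXISTS `test-clickhouse`.test_table ON CLUSTER clu;
DROP TABLE IF EXISTS `test-clickhouse`.test_table_local ON CLUSTER clu;
```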

5. Viewing ClickHouse process information

select * from system.processes;
-- Show the IP address each query was issued from
select initial_address from system.processes;
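
In practice it is often easier to select a few columns rather than everything. The sketch below lists the running queries, longest first, and shows how one of them could be stopped; the query_id value is a placeholder that must be replaced with an id taken from the first result.

```sql
-- Running queries on this node, longest-running first
SELECT query_id, user, initial_address, elapsed, query
FROM system.processes
ORDER BY elapsed DESC;

-- Stop a specific query; 'your-query-id' is a placeholder, not a real id
KILL QUERY WHERE query_id = 'your-query-id';
```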

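6. Notes

1. on cluster cluster_name: this clause makes the operation take effect on every node in the cluster.

2. When creating or dropping tables, check that the operation was applied on all nodes of the cluster, so that the local tables and the distributed tables exist on the same nodes.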

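3. Distributed tables

On receiving data, a distributed table splits it into multiple parts and forwards them to the other servers.
The data is first written to disk on the machine hosting the distributed table, then sent asynchronously to the machines hosting the local tables for storage.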
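
Since forwarding is asynchronous, rows written through the distributed table may not be visible on the shards right away. If that matters, for example just before a verification query, the pending data can be pushed out explicitly. A sketch using the test_table distributed table from section 1:

```sql
-- Force the distributed table to send any locally queued data to the shards now
SYSTEM FLUSH DISTRIBUTED `test-clickhouse`.test_table;
```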

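4. Keep partitions, shards, and replicas apart. A partition groups the rows of a table by the partition key, which lets queries skip irrelevant data and improves query efficiency (similar to Hive partitions). A shard is a node that holds a distinct subset of the data. A replica is a node that stores the same data as another node (a backup).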
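5. When changing a table (structure or data): if a cluster is deployed, always make the change cluster-wide, otherwise it may look as if nothing changed (one node's table was modified while the query read another node's table). Do not run these operations against the distributed table. Also, when updating data, avoid columns that are part of the sorting key or partition key, because ClickHouse rejects updates to key columns.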
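
To make note 5 concrete, here is a sketch of an update that follows these rules, using the test_table_local table from section 1. It targets the local table, runs on the whole cluster, and changes a column that is not part of any key; the new value and the condition are invented for the example.

```sql
-- OK: nick_name is not in the sorting key or the partition key
ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
    UPDATE nick_name = 'new_nick' WHERE id = 1;

-- Expected to fail: id is both the sorting key and the partition key
-- ALTER TABLE `test-clickhouse`.test_table_local ON CLUSTER clu
--     UPDATE id = 2 WHERE id = 1;
```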