Title: ClickHouse Replication with ClickHouse Keeper - Queries Only Hitting One Replica
Body:
I have a ClickHouse setup using Docker with the following configuration:
- 1 shard
- 3 replicas
- Using ClickHouse Keeper Server instead of an external ZooKeeper
Issue
Even though I have three replicas, all queries seem to be hitting only one replica. I was expecting ClickHouse to distribute read queries across the replicas for better performance.
Setup Details
I have three VMs:
- VM2 → replica 02
- VM3 → replica 03
Here's my cluster configuration:
<yandex>
    <remote_servers>
        <my_cluster>
            <shard>
                <replica>
                    <host>vm2</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>vm3</host>
                    <port>9000</port>
                </replica>
            </shard>
        </my_cluster>
    </remote_servers>
    <zookeeper>
        <node>
            <host>vm1</host>
            <port>2181</port>
        </node>
    </zookeeper>
</yandex>
What I Have Tried
- Running
  SELECT * FROM system.replicas;
  → All replicas show as active and in sync.
- Running
  SELECT * FROM system.clusters WHERE cluster = 'my_cluster';
  → Shows all replicas correctly.
- Using a distributed table:
  SELECT * FROM distributed_table;
  But still, only one replica is handling the queries.
Questions:
- How can I ensure that ClickHouse distributes queries across all replicas?
- Do I need to configure a specific load balancing strategy for read queries?
- Is there any way to confirm which replica is being used for queries?
Any guidance would be greatly appreciated!
Asked Mar 13 at 6:07 by Subhashree Mohan Swain
Comment: Maybe try to set a priority on the replica, like here: clickhouse/docs/engines/table-engines/special/… – jsc0218, Mar 13 at 14:26
2 Answers
How can I ensure that ClickHouse distributes queries across all replicas?
By default, a distributed query uses only one live replica per shard.
You can change this with SETTINGS max_parallel_replicas=3
and then check which hosts served the query:
SELECT hostName(), count() FROM db.distributed_table GROUP BY ALL
Do I need to configure a specific load balancing strategy for read queries?
Set <load_balancing> only if you understand why you need it. For example, nearest_hostname
is usually used in geo-distributed clusters to select replicas from the same region, where the region is encoded in the DNS hostnames given in <remote_servers> <host>.
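If you do want to experiment, here is a minimal sketch of trying a policy for a single query rather than changing the server config. It assumes the db.distributed_table name used above. round_robin is just one documented value of the load_balancing setting (the default is random), not a recommendation. The prefer_localhost_replica override is also an assumption about your setup: it defaults to 1, so if the node you connect to is itself a replica of the shard, it may keep answering locally whatever policy you choose.

```sql
-- Assumption: db.distributed_table is the Distributed table from the answer above.
-- Both settings apply to this query only; server-wide defaults stay untouched.
SELECT
    hostName() AS replica,   -- host that actually read the data
    count()    AS row_count
FROM db.distributed_table
GROUP BY replica
SETTINGS
    load_balancing = 'round_robin',
    prefer_localhost_replica = 0;  -- let the initiator pick a remote replica too
```

Running it several times and watching whether the replica column changes should show if the policy has any effect in your cluster.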
Is there any way to confirm which replica is being used for queries?
SELECT hostName(), count() FROM db.distributed_table GROUP BY ALL
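Another way to cross-check after the fact is the query log on every replica. This is a sketch and assumes that the cluster is named my_cluster as in the question's config, that system.query_log is enabled, and that the table name appears literally in the query text. The query log is written asynchronously, so recent queries may take a few seconds to appear.

```sql
-- Assumptions: cluster 'my_cluster' (from the question), query_log enabled,
-- and 'distributed_table' appearing in the text of the queries of interest.
SELECT
    hostName()       AS replica,
    is_initial_query,            -- 1 = received from a client, 0 = sent by another node
    count()          AS finished_queries
FROM clusterAllReplicas('my_cluster', system.query_log)
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 10 MINUTE
  AND query ILIKE '%distributed_table%'
  AND query NOT ILIKE '%query_log%'   -- skip this monitoring query itself
GROUP BY replica, is_initial_query
ORDER BY replica, is_initial_query;
```

Rows with is_initial_query = 0 are the sub-queries the Distributed table sent to a replica, so a replica that never shows up there was not chosen during the time window.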
Actually, I found a solution and it worked. I had set an interserver communication path, but it may have been the wrong one, so I commented that setting out in my config file. ClickHouse now determines its interserver communication path automatically, and my replication works fine.