database - Getting ClickHouse data replication error across 3 nodes, but my DDL commands are working fine? - Stack Overflow

Title: ClickHouse Replication with ClickHouse Keeper - Queries Only Hitting One Replica

Body:

I have a ClickHouse setup using Docker with the following configuration:

  • 1 shard
  • 3 replicas
  • Using ClickHouse Keeper Server instead of an external ZooKeeper

Issue

Even though I have three replicas, all queries seem to be hitting only one replica. I was expecting ClickHouse to distribute read queries across the replicas for better performance.

Setup Details

I have three VMs:

  • VM2: replica 02
  • VM3: replica 03

Here's my cluster configuration:

<yandex>
    <remote_servers>
        <my_cluster>
            <shard>
                <replica>
                    <host>vm2</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>vm3</host>
                    <port>9000</port>
                </replica>
            </shard>
        </my_cluster>
    </remote_servers>

    <zookeeper>
        <node>
            <host>vm1</host>
            <port>2181</port>
        </node>
    </zookeeper>
</yandex>

What I Have Tried

  1. Running SELECT * FROM system.replicas; → All replicas show as active and in sync.
  2. Running SELECT * FROM system.clusters WHERE cluster = 'my_cluster'; → Shows all replicas correctly.
  3. Querying through a distributed table:
    SELECT * FROM distributed_table;
    
    But still, only one replica is handling the queries.

Questions:

  1. How can I ensure that ClickHouse distributes queries across all replicas?
  2. Do I need to configure a specific load balancing strategy for read queries?
  3. Is there any way to confirm which replica is being used for queries?

Any guidance would be greatly appreciated!

asked Mar 13 at 6:07 by Subhashree Mohan Swain
  • Maybe try setting a priority on the replicas, as described here: clickhouse/docs/engines/table-engines/special/… – jsc0218, commented Mar 13 at 14:26

2 Answers

How can I ensure that ClickHouse distributes queries across all replicas?

By default, a distributed query uses only one live replica per shard.
You can change this with SETTINGS max_parallel_replicas=3,
then check which hosts answered:
SELECT hostName(), count() FROM db.distributed_table GROUP BY ALL

Do I need to configure a specific load balancing strategy for read queries?

Use <load_balancing> only if you understand why you need it. For example, nearest_hostname is usually used for geo-distributed clusters, so that a query selects from replicas in the same region, with the region encoded in the DNS hostnames listed under <remote_servers> <host>.
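
To make the nearest_hostname idea concrete, here is a rough Python sketch of the comparison as I understand it from the ClickHouse documentation: count the positions where the local and replica host names differ, up to the length of the shorter name, and prefer the replica with the fewest differences. The host names are made up for illustration. The sketch ignores the per-replica error counts that ClickHouse looks at first, and breaking ties by list order is a choice of this sketch, not something taken from ClickHouse.

```python
def hostname_distance(local: str, remote: str) -> int:
    """Count positions where the two names differ, up to the shorter length."""
    return sum(a != b for a, b in zip(local, remote))


def pick_nearest(local: str, replicas: list[str]) -> str:
    """Return the replica whose name differs least from the local host name.

    Ties go to the replica listed first (an assumption of this sketch).
    """
    return min(replicas, key=lambda name: hostname_distance(local, name))


# Hypothetical host names with the region encoded as a prefix.
replicas = ["eu-west-ch1", "us-east-ch1", "ap-east-ch1"]
chosen = pick_nearest("eu-west-ch2", replicas)
print(chosen)
```

With names like vm2 and vm3, as in the question, every replica is equally "near", so this strategy gives no useful preference there.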

Is there any way to confirm which replica is being used for queries?

SELECT hostName(), count() FROM db.distributed_table GROUP BY ALL
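
A single run only tells you about that one query, so it helps to run the check several times and tally the hosts. Below is a small Python sketch of that tally. It assumes you have already collected the (hostName, count) rows of each run in whatever client you use; the host names and row counts in the sample are made up, and tally_hosts is a hypothetical helper, not part of any ClickHouse client.

```python
from collections import Counter


def tally_hosts(runs):
    """Per host, count the runs it appeared in and sum the rows it reported."""
    runs_per_host = Counter()
    rows_per_host = Counter()
    for rows in runs:
        for host, count in rows:
            runs_per_host[host] += 1
            rows_per_host[host] += count
    return runs_per_host, rows_per_host


# Made-up results of four runs of the check query (hypothetical values).
runs = [
    [("vm2", 1000)],
    [("vm3", 1000)],
    [("vm2", 1000)],
    [("vm2", 1000)],
]
runs_per_host, rows_per_host = tally_hosts(runs)
for host in sorted(runs_per_host):
    print(host, runs_per_host[host], rows_per_host[host])
```

If only one host ever shows up across many runs, the queries really are landing on a single replica.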

Actually, I found a solution and it worked. I had set an interserver communication path in my config, but it may have been the wrong path, so I commented it out in my config file. ClickHouse now settles its interserver communication path automatically, and my replication works fine.
