
I am running Apache Kafka as a Docker container, using the image at https://hub.docker.com/r/wurstmeister/kafka/

I can connect to Kafka successfully from my Java application using Spring Kafka.

However, when I try to connect to Kafka from my Ruby application using Ruby Kafka, I get the following error:

Uncaught exception: Failed to find group coordinator

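The only difference between the Java and Ruby applications is that the Ruby application runs on a different machine in my local network. From the Ruby machine I can reach the Kafka machine and all of its ports.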

How can I track down this problem and fix it?

Update:

I, [2018-06-25T10:06:49.513848 #62261]  INFO -- : New topics added to target list: post.sent
I, [2018-06-25T10:06:49.514036 #62261]  INFO -- : Fetching cluster metadata from kafka://10.0.0.102:9093
D, [2018-06-25T10:06:49.514262 #62261] DEBUG -- : Opening connection to 10.0.0.102:9093 with client id test...
D, [2018-06-25T10:06:49.518350 #62261] DEBUG -- : Sending topic_metadata API request 1 to 10.0.0.102:9093
D, [2018-06-25T10:06:49.519336 #62261] DEBUG -- : Waiting for response 1 from 10.0.0.102:9093
D, [2018-06-25T10:06:49.530220 #62261] DEBUG -- : Received response 1 from 10.0.0.102:9093
I, [2018-06-25T10:06:49.530351 #62261]  INFO -- : Discovered cluster metadata; nodes: 10.0.75.1:9093 (node_id=1001)
D, [2018-06-25T10:06:49.530439 #62261] DEBUG -- : Closing socket to 10.0.0.102:9093
I, [2018-06-25T10:06:49.530682 #62261]  INFO -- : Joining group `my_group`
D, [2018-06-25T10:06:49.530812 #62261] DEBUG -- : Getting group coordinator for `my_group`
D, [2018-06-25T10:06:49.531019 #62261] DEBUG -- : Opening connection to 10.0.75.1:9093 with client id test...
D, [2018-06-25T10:06:49.616368 #62261] DEBUG -- : Handling fetcher command: subscribe
I, [2018-06-25T10:06:49.616797 #62261]  INFO -- : Will fetch at most 1048576 bytes at a time per partition from post.sent
D, [2018-06-25T10:06:49.617262 #62261] DEBUG -- : Handling fetcher command: configure
D, [2018-06-25T10:06:49.617462 #62261] DEBUG -- : Handling fetcher command: start
D, [2018-06-25T10:06:49.617599 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:49.618108 #62261]  INFO -- : Fetching cluster metadata from kafka://10.0.0.102:9093
D, [2018-06-25T10:06:49.619053 #62261] DEBUG -- : Opening connection to 10.0.0.102:9093 with client id test...
D, [2018-06-25T10:06:49.624053 #62261] DEBUG -- : Sending topic_metadata API request 1 to 10.0.0.102:9093
D, [2018-06-25T10:06:49.625459 #62261] DEBUG -- : Waiting for response 1 from 10.0.0.102:9093
D, [2018-06-25T10:06:49.635283 #62261] DEBUG -- : Received response 1 from 10.0.0.102:9093
I, [2018-06-25T10:06:49.635468 #62261]  INFO -- : Discovered cluster metadata; nodes: 10.0.75.1:9093 (node_id=1001)
D, [2018-06-25T10:06:49.635596 #62261] DEBUG -- : Closing socket to 10.0.0.102:9093
I, [2018-06-25T10:06:49.635853 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:50.637187 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:50.637804 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:51.642172 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:51.642471 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:52.645354 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:52.645640 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:53.647833 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:53.648259 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:54.650357 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:54.650647 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:55.652582 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:55.653477 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:56.657937 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:56.659627 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:57.664130 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:57.664861 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
D, [2018-06-25T10:06:58.666290 #62261] DEBUG -- : Fetching batches
I, [2018-06-25T10:06:58.666620 #62261]  INFO -- : There are no partitions to fetch from, sleeping for 1s
E, [2018-06-25T10:06:59.534809 #62261] ERROR -- : Timed out while trying to connect to 10.0.75.1:9093: Operation timed out
D, [2018-06-25T10:06:59.535083 #62261] DEBUG -- : Closing socket to 10.0.75.1:9093
E, [2018-06-25T10:06:59.535342 #62261] ERROR -- : Failed to get group coordinator info from 10.0.75.1:9093 (node_id=1001): Operation timed out
I, [2018-06-25T10:06:59.535567 #62261]  INFO -- : Leaving group `my_group`
D, [2018-06-25T10:06:59.535709 #62261] DEBUG -- : Getting group coordinator for `my_group`
D, [2018-06-25T10:06:59.535875 #62261] DEBUG -- : Opening connection to 10.0.75.1:9093 with client id test...
D, [2018-06-25T10:06:59.666983 #62261] DEBUG -- : Handling fetcher command: stop
E, [2018-06-25T10:07:09.540409 #62261] ERROR -- : Timed out while trying to connect to 10.0.75.1:9093: Operation timed out
D, [2018-06-25T10:07:09.540833 #62261] DEBUG -- : Closing socket to 10.0.75.1:9093
E, [2018-06-25T10:07:09.541172 #62261] ERROR -- : Failed to get group coordinator info from 10.0.75.1:9093 (node_id=1001): Operation timed out
Exiting
Uncaught exception: Failed to find group coordinator

1 Answer


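Your broker is returning a Group Coordinator Not Available error. Under normal circumstances this is a transient condition that occurs while your Kafka cluster is still assigning a coordinator node. Unfortunately, in your case something in the cluster is not working properly and no coordinator is being assigned.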

You should check your cluster configuration, starting with the logs.
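One check that the client log in the question allows directly is whether the node addresses returned in the cluster metadata match the address the client bootstrapped from. The following is only a minimal sketch, assuming the Ruby Kafka log line format quoted in the question; the function name advertised_mismatches and the two regular expressions are illustrative, not part of any library.

```python
import re

# Patterns follow the client log lines quoted in the question (assumed format).
BOOTSTRAP_RE = re.compile(r"Fetching cluster metadata from kafka://([\w.\-]+):(\d+)")
NODE_RE = re.compile(r"([\w.\-]+):(\d+) \(node_id=(\d+)\)")


def advertised_mismatches(log_text):
    """Return (bootstrap, advertised) address pairs whose hosts differ."""
    bootstrap = None
    mismatches = set()
    for line in log_text.splitlines():
        m = BOOTSTRAP_RE.search(line)
        if m:
            # Remember the address the client used to ask for metadata.
            bootstrap = (m.group(1), int(m.group(2)))
            continue
        if "Discovered cluster metadata" in line and bootstrap is not None:
            # Each node listed here is an address the broker told the client to use.
            for host, port, _node_id in NODE_RE.findall(line):
                if host != bootstrap[0]:
                    mismatches.add((bootstrap, (host, int(port))))
    return sorted(mismatches)


# Two lines copied from the log in the question.
sample = """\
I, [2018-06-25T10:06:49.514036 #62261]  INFO -- : Fetching cluster metadata from kafka://10.0.0.102:9093
I, [2018-06-25T10:06:49.530351 #62261]  INFO -- : Discovered cluster metadata; nodes: 10.0.75.1:9093 (node_id=1001)
"""
print(advertised_mismatches(sample))
```

A non-empty result only means the broker hands out an address other than the one used to bootstrap; whether that address is reachable from the client machine still has to be checked separately. In the log above, the client bootstraps from 10.0.0.102:9093, the metadata lists 10.0.75.1:9093, and the timeouts are all against 10.0.75.1:9093.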

You may find the solution posted here useful. I quote:

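When the bootstrap-server parameter is used, the connection goes through the brokers rather than Zookeeper. The brokers use the __consumer_offsets topic to store the committed offsets of each topic:partition for each consumer group (groupID). In this case, __consumer_offsets was pointing to invalid broker IDs, which is why the exception above was raised. To check whether the broker IDs for this topic are correct, run the following command: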

kafka-topics.sh --describe --zookeeper <zkHost:zkPort> --topic __consumer_offsets

Then compare the result with the brokers registered in Zookeeper. Connect using the following command:

zkCli.sh -server <zkHost:zkPort>

Once connected to Zookeeper, run the following command to list the broker IDs:

[zk: server1.openstacklocal:2181(CONNECTED) 0] ls /brokers/ids

If the broker IDs do not match, proceed with the solution in this article.
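The comparison described above can be sketched as a small script. This is an illustration only: it assumes the describe output contains "Replicas: <id>,<id>" fields and that `ls /brokers/ids` prints a bracketed list such as "[1001]". The function names and the sample text are hypothetical and do not come from the quoted article.

```python
import re


def replica_ids(describe_output):
    """Collect broker IDs from the 'Replicas:' fields of the describe output."""
    ids = set()
    for match in re.finditer(r"Replicas:\s*([\d,]+)", describe_output):
        ids.update(int(x) for x in match.group(1).split(",") if x)
    return ids


def registered_ids(zk_ls_output):
    """Parse the bracketed list printed by `ls /brokers/ids`, e.g. "[1001, 1002]"."""
    match = re.search(r"\[([\d,\s]*)\]", zk_ls_output)
    if not match:
        return set()
    return {int(x) for x in match.group(1).split(",") if x.strip()}


def stale_broker_ids(describe_output, zk_ls_output):
    """Broker IDs the topic refers to that are not registered in Zookeeper."""
    return replica_ids(describe_output) - registered_ids(zk_ls_output)


# Made-up sample: the topic still references broker 1, but only 1001 is registered.
sample_describe = (
    "Topic: __consumer_offsets\tPartition: 0\tLeader: 1\tReplicas: 1\tIsr: 1\n"
    "Topic: __consumer_offsets\tPartition: 1\tLeader: 1\tReplicas: 1\tIsr: 1\n"
)
sample_ls = "[1001]"
print(stale_broker_ids(sample_describe, sample_ls))
```

An empty set means every broker ID referenced by __consumer_offsets is registered, in which case the solution below is probably not the right one for your cluster.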

Solution:

To resolve this issue, do the following:

Connect to Zookeeper using the following command:
 zkCli.sh -server <zkHost:zkPort>

Delete the __consumer_offsets znode using the following command:

 rmr /brokers/topics/__consumer_offsets

Restart the brokers.

Side note: this problem sometimes occurs while Zookeeper is still starting up.

Answered 2018-06-24T06:17:00.907