Consider this event loop:

  1. while more messages
  2.   msg_in = consumer.Poll()
  3.   msg_out = transform(msg_in)
  4.   service.Publish(msg_out)

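Assume a single partition and focus on line 4. Suppose that by the time this loop crashes, 5 messages have been sent to Kafka in the order 1, 2, 3, 4, 5, but Kafka has only N <= 5 of them. The rest are lost.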

Without retries, what can we say? Does Kafka get 1, or 1,2, or 1,2,3, or 1,2,3,4, or 1,2,3,4,5? Kafka does guarantee ordering per partition.

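With retries, ordering can of course be lost, and Kafka may end up with any permutation P of the m messages it actually received, for m from 0 to N. That much is understandable.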

I'm using Confluent's Go wrapper around rdkafka, but let's focus on rdkafka itself.

1 Answer

It's even more complicated than you think :-)

librdkafka supports max.in.flight.requests.per.connection. If you set that to 1, retries should be safe to enable under some circumstances (namely, if they are infinite; any setting that discards undelivered data may reorder).
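To see why the in-flight limit matters, here is a toy model in Go, not librdkafka's actual retry logic. It assumes a failed message is retried only after the other requests in its window have been acknowledged; `deliver` and `failFirst` are hypothetical names invented for this sketch.

```go
package main

import "fmt"

// deliver is a toy model of a producer sending msgs to one partition with at
// most maxInFlight requests outstanding. Messages listed in failFirst fail on
// their first attempt and are retried at the front of the next window.
func deliver(msgs []int, maxInFlight int, failFirst map[int]bool) []int {
	if maxInFlight < 1 {
		maxInFlight = 1
	}
	var log []int
	alreadyFailed := map[int]bool{}
	queue := append([]int(nil), msgs...)
	for len(queue) > 0 {
		n := maxInFlight
		if n > len(queue) {
			n = len(queue)
		}
		window, rest := queue[:n], queue[n:]
		var retry []int
		for _, m := range window {
			if failFirst[m] && !alreadyFailed[m] {
				alreadyFailed[m] = true
				retry = append(retry, m) // retried in the next round
				continue
			}
			log = append(log, m) // acknowledged and appended to the partition
		}
		queue = append(retry, rest...)
	}
	return log
}

func main() {
	msgs := []int{1, 2, 3, 4, 5}
	fail := map[int]bool{1: true}
	fmt.Println(deliver(msgs, 2, fail)) // [2 1 3 4 5]
	fmt.Println(deliver(msgs, 1, fail)) // [1 2 3 4 5]
}
```

With two requests in flight, message 2 lands before the retry of message 1. With one in flight, the retry simply delays everything behind it and the order holds.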

On newer versions, enable.idempotence extends this guarantee to a max in-flight of up to 5.
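As a rough sketch, the two setups described above could be written as librdkafka properties like this. The property names are real librdkafka settings; the `orderedProducerConfig` helper and the use of a plain Go map (rather than the Confluent wrapper's config type) are assumptions made to keep the example dependency-free. Setting message.timeout.ms to 0 is one way to avoid discarding undelivered data, at the cost of messages never timing out locally.

```go
package main

import "fmt"

// orderedProducerConfig returns librdkafka producer properties as a plain map.
// The helper and the chosen values are illustrative, not a recommended default.
func orderedProducerConfig(idempotent bool) map[string]string {
	cfg := map[string]string{
		// 0 = no local message timeout, so pending messages are not dropped
		// while retries are still outstanding.
		"message.timeout.ms": "0",
	}
	if idempotent {
		// Newer versions: idempotent producer, up to 5 requests in flight.
		cfg["enable.idempotence"] = "true"
		cfg["max.in.flight.requests.per.connection"] = "5"
	} else {
		// Older versions: a single request in flight.
		cfg["max.in.flight.requests.per.connection"] = "1"
	}
	return cfg
}

func main() {
	for _, idempotent := range []bool{false, true} {
		fmt.Println(idempotent, orderedProducerConfig(idempotent))
	}
}
```

The resulting key/value pairs can be passed to whichever client wrapper is in use, since they all forward these properties to librdkafka.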

Another "interesting" scenario: records 1, 2, 3, 4 are delivered, the leader broker crashes, an unclean leader is elected, record 4 is dropped, and then 5 is delivered, resulting in 1, 2, 3, 5 in the partition.
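The scenario above can be walked through with a toy model in Go. It is only a sketch: `afterUncleanElection` is a hypothetical helper, and it assumes the follower that takes over had replicated records 1 to 3 but not 4.

```go
package main

import "fmt"

// afterUncleanElection models an out-of-sync follower becoming leader: the
// partition continues from the follower's log, so anything only the old
// leader held is gone, and later records are appended after the gap.
func afterUncleanElection(followerLog []int, later ...int) []int {
	log := append([]int(nil), followerLog...) // copy, do not alias the input
	return append(log, later...)
}

func main() {
	oldLeader := []int{1, 2, 3, 4}
	follower := []int{1, 2, 3} // record 4 was never replicated
	fmt.Println("old leader:", oldLeader)
	fmt.Println("partition: ", afterUncleanElection(follower, 5)) // [1 2 3 5]
}
```

The relative order of the surviving records is still intact, but the log has a hole that the producer cannot detect after the fact.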

Or maybe the topic in question is log-compacted and some of these records have the exact same key?
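To illustrate the compaction case, here is a minimal Go sketch of what a fully compacted log might retain. The `record` type, the `compact` function, and the keys are made up for the example, and real compaction runs asynchronously on older log segments, so this only shows the eventual result.

```go
package main

import "fmt"

// record is a hypothetical keyed message; seq is the order it was produced in.
type record struct {
	key string
	seq int
}

// compact keeps only the latest record for each key, preserving log order
// among the survivors.
func compact(log []record) []record {
	last := make(map[string]int)
	for i, r := range log {
		last[r.key] = i // later occurrences overwrite earlier ones
	}
	var out []record
	for i, r := range log {
		if last[r.key] == i {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	log := []record{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}}
	fmt.Println(compact(log)) // [{a 3} {c 4} {b 5}]
}
```

A consumer reading the compacted topic would see only 3, 4, 5 even though all five records were delivered in order.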

Answered 2019-09-21T05:15:41.363