WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] 1 partitions have leader brokers without a matching listener
1 Problem Description
A colleague reported that after one server in our three-node Kafka cluster went down, the business was affected: messages could neither be produced nor consumed. The application kept logging the following warning:
WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] 1 partitions have leader brokers without a matching listener, including [baidd-0] (org.apache.kafka.clients.NetworkClient)
2 Fault Simulation
2.1 Scenario: the partition has only one replica
#Produce messages
[root@Centos7-Mode-V8 kafka]# bin/kafka-console-producer.sh --broker-list 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic baidd
>aa
>bb
#Under normal conditions the consumer receives the messages:
[root@Centos7-Mode-V8 kafka]# bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic baidd
aa
bb
2.1.1 Simulation: shut down the node hosting the topic's leader
#Use Kafka Tool to check which node hosts the leader of this topic's partition
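If the Kafka Tool GUI is not at hand, the same information can be read from the command line; a minimal sketch, reusing the ZooKeeper addresses used elsewhere in this post:
#The Leader column shows the broker id that currently hosts the partition leader
bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe --topic baidd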
After shutting down the leader node, the producer and every consumer process kept printing the following warning:
[2021-09-23 17:09:53,495] WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] 1 partitions have leader brokers without a matching listener, including [baidd-0] (org.apache.kafka.clients.NetworkClient)
Messages could neither be produced nor consumed.
2.1.2 Simulation: shut down a non-leader node
Sometimes the consumer process reports: [2021-09-23 17:21:22,480] WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] Connection to node 2147483645 (/192.168.144.253:9193) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
While this warning is being reported, messages can still be produced, but the messages produced during this period cannot be consumed.
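While a broker is down, the state of the consumer group can also be checked from the command line; a minimal sketch, using the auto-generated group id that appears in the warnings above:
#Show the group's members, committed offsets and lag
bin/kafka-consumer-groups.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --describe --group console-consumer-55928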
2.1.3 Summary
When a partition has only one replica, stopping any node affects the business:
If the node hosting the partition leader goes down, both producing and consuming messages are affected.
If a non-leader node goes down, consuming messages is affected.
2.2 Scenario: the partition has multiple replicas
Being affected when a partition has no other replica is understandable, so we tried giving the topic multiple replicas, and found that the business was, surprisingly, still affected:
#Create a topic with three replicas
bin/kafka-topics.sh --create --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --replication-factor 3 --partitions 1 --topic song
#Check the replica assignment
[root@Centos7-Mode-V8 kafka]# bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe --topic song
Topic:song PartitionCount:1 ReplicationFactor:3 Configs:
Topic: song Partition: 0 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
#Produce messages
bin/kafka-console-producer.sh --broker-list 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song
#Consumer process 1
bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song --group g1
#Consumer process 2
bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song --group g2
#Simulate shutting down the node hosting this topic's leader
Messages could still be produced, and the "1 partitions have leader brokers without a matching listener" warning no longer appeared. However, after losing the connection to the topic leader, the consumers sometimes reported:
[2021-09-24 19:01:06,316] WARN [Consumer clientId=consumer-1, groupId=console-consumer-27609] Connection to node 2147483647 (/192.168.144.247:9193) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
Some of the messages produced during this window did not arrive; the messages produced while the node was down could not be consumed.
So why were messages still lost after a node failure even though the topic had multiple replicas?
Answer: __consumer_offsets had only one replica, which prevents even topics with multiple replicas from being highly available.
#Later, after expanding the replicas of Kafka's built-in __consumer_offsets topic, the other ordinary topics did become highly available.
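This can be confirmed by describing the internal topic; a minimal sketch, reusing the ZooKeeper addresses from above (the first line of output shows ReplicationFactor, the following lines show Replicas/Isr per partition):
#Check how many replicas __consumer_offsets currently has
bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe --topic __consumer_offsets | head -5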
3 Root Cause
default.replication.factor was not set in the Kafka configuration file. Its default value is 1, so every new topic effectively had a single copy of its data, a single point of failure.
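A quick way to confirm this on each broker is to grep the server configuration; a minimal sketch, assuming the installation path /usr/local/kafka that the script later in this post uses:
#An absent (or commented-out) entry means the built-in default of 1 is in effect
grep -E 'default.replication.factor|offsets.topic.replication.factor' /usr/local/kafka/config/server.properties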
4 Solution
- Edit the Kafka configuration file and raise the default replication factor for topics (this parameter defaults to 1):
default.replication.factor=3
With default.replication.factor=3 set, offsets.topic.replication.factor will also default to 3.
Note: do not set default.replication.factor=3 while also setting offsets.topic.replication.factor=1; the explicit offsets.topic.replication.factor value would override default.replication.factor for the offsets topic (the combined settings are shown in the excerpt below).
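A consolidated excerpt of how server.properties would look on each broker after the change, a sketch based on the recommendations above:
#server.properties (excerpt)
default.replication.factor=3
offsets.topic.replication.factor=3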
#Restart Kafka
(omitted)
- Expand the replicas of the existing ordinary topics
See https://blog.csdn.net/yabingshi_tech/article/details/120443647 for the detailed steps; the core command is sketched below.
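The core of that procedure is a partition reassignment; a minimal sketch for the single-partition, single-replica topic baidd from section 2.1 (the JSON follows the same format as the __consumer_offsets file below):
#Write a reassignment plan that places the single partition of baidd on brokers 0, 1 and 2
cat > baidd.json <<EOF
{"version":1,"partitions":[{"topic":"baidd","partition":0,"replicas":[0,1,2]}]}
EOF
bin/kafka-reassign-partitions.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --reassignment-json-file baidd.json --execute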
- Expand the replicas of __consumer_offsets
The method is the same as above; the JSON file contents are as follows (the commands to apply it are sketched after the JSON):
{
"version": 1,
"partitions": [
{
"topic": "__consumer_offsets",
"partition": 0,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 1,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 2,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 3,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 4,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 5,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 6,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 7,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 8,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 9,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 10,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 11,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 12,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 13,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 14,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 15,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 16,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 17,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 18,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 19,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 20,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 21,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 22,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 23,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 24,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 25,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 26,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 27,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 28,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 29,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 30,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 31,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 32,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 33,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 34,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 35,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 36,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 37,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 38,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 39,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 40,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 41,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 42,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 43,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 44,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 45,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 46,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 47,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 48,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 49,
"replicas": [
2,
0,
1
]
}
]
}
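Once this JSON is saved, for example as __consumer_offsets.json, the reassignment can be applied and then checked; a minimal sketch reusing the same ZooKeeper addresses:
bin/kafka-reassign-partitions.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --reassignment-json-file __consumer_offsets.json --execute
#Re-run with --verify until every partition reports that the reassignment completed
bin/kafka-reassign-partitions.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --reassignment-json-file __consumer_offsets.json --verify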
Script to batch-expand the replicas of the topics above:
#!/bin/bash
#This script sets the replication factor of every existing topic in the Kafka cluster to 3.
if [[ -z "$1" ]]
then
echo 'Please pass the ZooKeeper address list of the Kafka cluster when calling this script, in the format ip1:port,ip2:port,ip3:port. Example:'
echo 'sh expand_replica.sh 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292'
exit 1
else
echo '1. Creating a directory to hold the replica-expansion json files...'
cd /opt/
mkdir -p kafka_json
cd kafka_json
echo '2. Expanding the replicas of ordinary topics...'
#List all ordinary topics (everything except __consumer_offsets)
str1=$(/usr/local/kafka/bin/kafka-topics.sh --zookeeper "$1" --list | grep -v __consumer_offsets)
echo "$str1" > topic.txt
#Generate one replica-expansion json file per topic
for TopicName in `cat /opt/kafka_json/topic.txt`
do
echo ' '
echo 'Expanding replicas for '$TopicName' (if "Successfully started reassignment of partitions" is printed below, the reassignment was submitted successfully)...'
#Generate the replica-expansion json file (only partition 0 is listed; this assumes each ordinary topic has a single partition, as in this post)
cat>${TopicName}.json<<EOF
{
"version": 1,
"partitions": [
{
"topic": "$TopicName",
"partition": 0,
"replicas": [
0,
1,
2
]
}
]
}
EOF
#Expand the replicas of this ordinary topic
/usr/local/kafka/bin/kafka-reassign-partitions.sh --zookeeper "$1" --reassignment-json-file ${TopicName}.json --execute
done
echo '3. Expanding the replicas of __consumer_offsets...'
#Generate the json file
cat>__consumer_offsets.json<<EOF
{
"version": 1,
"partitions": [
{
"topic": "__consumer_offsets",
"partition": 0,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 1,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 2,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 3,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 4,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 5,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 6,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 7,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 8,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 9,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 10,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 11,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 12,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 13,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 14,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 15,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 16,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 17,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 18,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 19,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 20,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 21,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 22,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 23,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 24,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 25,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 26,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 27,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 28,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 29,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 30,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 31,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 32,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 33,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 34,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 35,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 36,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 37,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 38,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 39,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 40,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 41,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 42,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 43,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 44,
"replicas": [
2,
0,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 45,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 46,
"replicas": [
0,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 47,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 48,
"replicas": [
1,
2,
0
]
},
{
"topic": "__consumer_offsets",
"partition": 49,
"replicas": [
2,
0,
1
]
}
]
}
EOF
#Apply the reassignment for __consumer_offsets
/usr/local/kafka/bin/kafka-reassign-partitions.sh --zookeeper "$1" --reassignment-json-file __consumer_offsets.json --execute
fi
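A usage sketch: save the script, for example as expand_replica.sh, run it against the cluster's ZooKeeper ensemble, then confirm that the internal topic now has three replicas:
sh expand_replica.sh 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292
#Afterwards ReplicationFactor should read 3 and Isr should list all three brokers
/usr/local/kafka/bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe --topic __consumer_offsets | head -3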
This article drew on the post "Kafka突然宕机了?稳住,莫慌!".