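Redis cluster reshard failed: how to recover from "[WARNING] The following slots are open: xxxx" (detailed steps)
A hands-on, step-by-step walkthrough of how to recover when an error partway through resharding a Redis cluster causes the reshard to fail.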
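Resharding is an important tool for managing a Redis cluster and is commonly used when scaling the cluster's nodes out or in. If the cluster goes down for some reason while a reshard is running, the reshard stops abruptly and leaves slots in an intermediate state. This walkthrough shows how to recover. The example assumes that one Redis node died during the reshard (simulated here by stopping that node's container).
Check the node status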
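Check the state of the nodes with a cluster check. Here the reshard command is simply run again, since it performs a cluster check before doing anything else: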
$ redis-cli --cluster reshard 192.168.1.196:6379 -a xxxxx
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.1.196:6679: Connection refused
>>> Performing Cluster Check (using node 192.168.1.196:6379)
M: d1851a905c4daf870dc9ca7c28c3ebe29585af07 192.168.1.196:6379
slots:[0-3276] (3277 slots) master
M: bfc47617d26387e800c1ff1b714d25175caaaa60 192.168.1.196:6579
slots:[6554-9982] (3429 slots) master
M: 74c8b8101bc810e82d98926c9d0f60b8f1b5d163 192.168.1.196:6479
slots:[3277-6553] (3277 slots) master
M: d93bf1672e1d83180bb156740592129e957f7289 192.168.1.196:6779
slots:[13107-16383] (3277 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.196:6579 has slots in importing state 9983.
[WARNING] The following slots are open: 9983.
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
The output shows two problems:
1. Slot 9983 is open.
2. The slots are not fully covered.
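The two problems listed above can also be picked out of the check output by a script, which helps when a cluster has many nodes. The following is a minimal sketch, not part of redis-cli: the function name `parse_check_output` is hypothetical, the patterns are modeled on the lines shown in this article, and the comma-separated form for several slots is an assumption that should be checked against your redis-cli version.

```python
import re

# Patterns modeled on the warning lines shown in this article.
STATE_RE = re.compile(
    r"\[WARNING\] Node (\S+) has slots in (importing|migrating) state ([\d, ]+)\."
)
OPEN_RE = re.compile(r"\[WARNING\] The following slots are open: ([\d, ]+)\.")


def _to_slots(text):
    """Turn '9983' or '9983,9984' into a list of ints (assumed list format)."""
    return [int(part) for part in text.split(",") if part.strip()]


def parse_check_output(text):
    """Collect open slots and the coverage verdict from check output text."""
    report = {"importing": {}, "migrating": {}, "open_slots": [], "covered": None}
    for raw in text.splitlines():
        line = raw.strip()
        state = STATE_RE.match(line)
        if state:
            node, kind, slots = state.groups()
            report[kind][node] = _to_slots(slots)
            continue
        opened = OPEN_RE.match(line)
        if opened:
            report["open_slots"] = _to_slots(opened.group(1))
        elif line.startswith("[ERR] Not all 16384 slots are covered"):
            report["covered"] = False
        elif line.startswith("[OK] All 16384 slots covered"):
            report["covered"] = True
    return report


# Excerpt of the output above, taken while node 6679 was still down.
sample = """\
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.196:6579 has slots in importing state 9983.
[WARNING] The following slots are open: 9983.
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
"""

report = parse_check_output(sample)
print(report)
```

A `covered` value of `False` together with a non-empty `open_slots` list corresponds to the two problems above.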
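First, make sure all nodes are running
Because I had deliberately stopped one master node, it is enough to start that node again.
Once it is up, run the cluster check command again: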
$ redis-cli --cluster check 192.168.1.196:6379 -a xxxxxx
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.196:6379 (d1851a90...) -> 19083 keys | 3277 slots | 0 slaves.
192.168.1.196:6579 (bfc47617...) -> 19732 keys | 3429 slots | 0 slaves.
192.168.1.196:6479 (74c8b810...) -> 19184 keys | 3277 slots | 0 slaves.
192.168.1.196:6679 (61022de0...) -> 18095 keys | 3124 slots | 0 slaves.
192.168.1.196:6779 (d93bf167...) -> 19080 keys | 3277 slots | 0 slaves.
[OK] 95174 keys in 5 masters.
5.81 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.1.196:6379)
M: d1851a905c4daf870dc9ca7c28c3ebe29585af07 192.168.1.196:6379
slots:[0-3276] (3277 slots) master
M: bfc47617d26387e800c1ff1b714d25175caaaa60 192.168.1.196:6579
slots:[6554-9982] (3429 slots) master
M: 74c8b8101bc810e82d98926c9d0f60b8f1b5d163 192.168.1.196:6479
slots:[3277-6553] (3277 slots) master
M: 61022de0e08f419c571c892188ac1e689d9528e6 192.168.1.196:6679
slots:[9983-13106] (3124 slots) master
M: d93bf1672e1d83180bb156740592129e957f7289 192.168.1.196:6779
slots:[13107-16383] (3277 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.196:6579 has slots in importing state 9983.
[WARNING] Node 192.168.1.196:6679 has slots in migrating state 9983.
[WARNING] The following slots are open: 9983.
>>> Check slots coverage...
[OK] All 16384 slots covered.
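The slot coverage problem is now resolved, but slot 9983 is still open: 192.168.1.196:6579 holds it in importing state and 192.168.1.196:6679 holds it in migrating state.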
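Why coverage failed while the node was down, and why it recovered after the restart, can be confirmed from the slot ranges alone. A small sketch using the ranges printed in the two outputs above; `find_gaps` is a hypothetical helper written for this illustration, not something redis-cli provides:

```python
TOTAL_SLOTS = 16384  # a Redis cluster always has slots 0..16383


def find_gaps(ranges, total=TOTAL_SLOTS):
    """Return the uncovered (start, end) slot ranges, both ends inclusive."""
    gaps = []
    next_expected = 0
    for start, end in sorted(ranges):
        if start > next_expected:
            gaps.append((next_expected, start - 1))
        next_expected = max(next_expected, end + 1)
    if next_expected < total:
        gaps.append((next_expected, total - 1))
    return gaps


# Ranges reported while node 6679 was down.
before_restart = [(0, 3276), (6554, 9982), (3277, 6553), (13107, 16383)]
# After the restart, node 6679 reports its range again.
after_restart = before_restart + [(9983, 13106)]

gaps_before = find_gaps(before_restart)
gaps_after = find_gaps(after_restart)
print(gaps_before)  # [(9983, 13106)]
print(gaps_after)   # []
```

The single gap, 9983-13106, is 3124 slots wide, which matches the slot count reported for 192.168.1.196:6679 once it is back.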
Next, repair the open slot with the cluster fix command
$ redis-cli --cluster fix 192.168.1.196:6379 -a xxxxxx
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.196:6379 (d1851a90...) -> 19083 keys | 3277 slots | 0 slaves.
192.168.1.196:6579 (bfc47617...) -> 19732 keys | 3429 slots | 0 slaves.
192.168.1.196:6479 (74c8b810...) -> 19184 keys | 3277 slots | 0 slaves.
192.168.1.196:6679 (61022de0...) -> 18095 keys | 3124 slots | 0 slaves.
192.168.1.196:6779 (d93bf167...) -> 19080 keys | 3277 slots | 0 slaves.
[OK] 95174 keys in 5 masters.
5.81 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.1.196:6379)
M: d1851a905c4daf870dc9ca7c28c3ebe29585af07 192.168.1.196:6379
slots:[0-3276] (3277 slots) master
M: bfc47617d26387e800c1ff1b714d25175caaaa60 192.168.1.196:6579
slots:[6554-9982] (3429 slots) master
M: 74c8b8101bc810e82d98926c9d0f60b8f1b5d163 192.168.1.196:6479
slots:[3277-6553] (3277 slots) master
M: 61022de0e08f419c571c892188ac1e689d9528e6 192.168.1.196:6679
slots:[9983-13106] (3124 slots) master
M: d93bf1672e1d83180bb156740592129e957f7289 192.168.1.196:6779
slots:[13107-16383] (3277 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.196:6579 has slots in importing state 9983.
[WARNING] Node 192.168.1.196:6679 has slots in migrating state 9983.
[WARNING] The following slots are open: 9983.
>>> Fixing open slot 9983
*** Found keys about slot 9983 in non-owner node 192.168.1.196:6579!
Set as migrating in: 192.168.1.196:6679
Set as importing in: 192.168.1.196:6579
>>> Nobody claims ownership, selecting an owner...
*** Configuring 192.168.1.196:6579 as the slot owner
>>> Case 2: Moving all the 9983 slot keys to its owner 192.168.1.196:6579
Moving slot 9983 from 192.168.1.196:6679 to 192.168.1.196:6579:
>>> Setting 9983 as STABLE in 192.168.1.196:6679
>>> Check slots coverage...
[OK] All 16384 slots covered.
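Command parameters:
Parameter | Description |
--cluster | Selects cluster management mode |
fix | The fix subcommand |
192.168.1.196:6379 | IP address and port of any node in the cluster |
-a xxxxxx | Login password for that node |
This command first runs a check and then automatically repairs the open slots.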
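Every run above prints redis-cli's warning about passing the password with `-a`. When scripting the check and fix steps, one option is to supply the password through the `REDISCLI_AUTH` environment variable, which redis-cli's help text mentions as the safer alternative. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and you should confirm that your redis-cli version honors `REDISCLI_AUTH` in `--cluster` mode before relying on it.

```python
import os
import subprocess


def build_cluster_command(subcommand, node, password=None):
    """Build argv and environment for `redis-cli --cluster <subcommand> <node>`."""
    argv = ["redis-cli", "--cluster", subcommand, node]
    env = dict(os.environ)
    if password is not None:
        # Assumption: redis-cli reads the password from REDISCLI_AUTH.
        env["REDISCLI_AUTH"] = password
    return argv, env


def run_cluster_command(subcommand, node, password=None):
    """Run the command and return the CompletedProcess (needs redis-cli on PATH)."""
    argv, env = build_cluster_command(subcommand, node, password)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


# Only builds the command here; nothing is executed.
argv, env = build_cluster_command("fix", "192.168.1.196:6379", password="xxxxxx")
print(" ".join(argv))
```

Calling `run_cluster_command("check", "192.168.1.196:6379", password="xxxxxx")` after the fix would then return the check output in `.stdout` for inspection.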
Verify
$ redis-cli --cluster check 192.168.1.196:6379 -a xxxxxx
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.196:6379 (d1851a90...) -> 19083 keys | 3277 slots | 0 slaves.
192.168.1.196:6579 (bfc47617...) -> 19732 keys | 3430 slots | 0 slaves.
192.168.1.196:6479 (74c8b810...) -> 19184 keys | 3277 slots | 0 slaves.
192.168.1.196:6679 (61022de0...) -> 18095 keys | 3123 slots | 0 slaves.
192.168.1.196:6779 (d93bf167...) -> 19080 keys | 3277 slots | 0 slaves.
[OK] 95174 keys in 5 masters.
5.81 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.1.196:6379)
M: d1851a905c4daf870dc9ca7c28c3ebe29585af07 192.168.1.196:6379
slots:[0-3276] (3277 slots) master
M: bfc47617d26387e800c1ff1b714d25175caaaa60 192.168.1.196:6579
slots:[6554-9983] (3430 slots) master
M: 74c8b8101bc810e82d98926c9d0f60b8f1b5d163 192.168.1.196:6479
slots:[3277-6553] (3277 slots) master
M: 61022de0e08f419c571c892188ac1e689d9528e6 192.168.1.196:6679
slots:[9984-13106] (3123 slots) master
M: d93bf1672e1d83180bb156740592129e957f7289 192.168.1.196:6779
slots:[13107-16383] (3277 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Done! No open slots remain, and slot 9983 now belongs to 192.168.1.196:6579.
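To double-check what the fix changed, the slot ranges from the check before the fix can be compared with those from the final check. A minimal sketch with the ranges copied from the outputs above; `expand` is a hypothetical helper for this illustration:

```python
def expand(ranges_by_node):
    """Map every slot number to the node that owns it."""
    owner = {}
    for node, (start, end) in ranges_by_node.items():
        for slot in range(start, end + 1):
            owner[slot] = node
    return owner


# Slot ranges before the fix (from the check after restarting the node).
before = {
    "192.168.1.196:6379": (0, 3276),
    "192.168.1.196:6479": (3277, 6553),
    "192.168.1.196:6579": (6554, 9982),
    "192.168.1.196:6679": (9983, 13106),
    "192.168.1.196:6779": (13107, 16383),
}
# Slot ranges after the fix (from the final check).
after = dict(before)
after["192.168.1.196:6579"] = (6554, 9983)
after["192.168.1.196:6679"] = (9984, 13106)

owner_before = expand(before)
owner_after = expand(after)

# Slots whose owner differs between the two checks.
moved = {
    slot: (owner_before[slot], owner_after[slot])
    for slot in owner_before
    if owner_before[slot] != owner_after[slot]
}
print(moved)  # {9983: ('192.168.1.196:6679', '192.168.1.196:6579')}
```

Only slot 9983 changed owner, which agrees with the slot counts in the final check: 3430 on 6579 and 3123 on 6679.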