MGR cluster primary node went down, but no failover occurred
MySQL 5.7.34 MGR cluster in single-primary mode. Member information:
+--------------------------------------+--------------+-------------+--------------+------------+
| MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | IS_PRIMARY |
+--------------------------------------+--------------+-------------+--------------+------------+
| 9c88649e-9f2f-11ee-ba5c-8c2a8e5c1504 | l76-186-p-yz |        3306 | ONLINE       | YES        |
| 9d7be087-9f2f-11ee-8595-8c2a8e5c187f | l76-187-p-yz |        3306 | ONLINE       | NO         |
| 9e5b9e54-9f2f-11ee-af5c-8c2a8e5c1516 | l76-188-p-yz |        3306 | ONLINE       | NO         |
+--------------------------------------+--------------+-------------+--------------+------------+
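The primary node l76-186-p-yz shut down abnormally when a data-center administrator accidentally hit its power button while moving machines. After the primary went down, no failover occurred. The logs from the two secondary nodes are as follows: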
2025-09-10T17:47:34.576495+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:47:46.923357+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:48:34.603089+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:48:46.953037+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:49:34.528252+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:49:46.978338+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:50:34.559733+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:50:47.007786+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:51:34.584621+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:51:47.034747+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:52:34.609476+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:52:47.060166+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:53:34.633964+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:53:47.082814+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:54:34.657171+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:54:47.105487+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:55:34.673285+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
The view on the two secondary nodes did not change; they still consider l76-186-p-yz to be the primary. The view is as follows:
+--------------------------------------+--------------+-------------+--------------+------------+
| MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE | IS_PRIMARY |
+--------------------------------------+--------------+-------------+--------------+------------+
| 9c88649e-9f2f-11ee-ba5c-8c2a8e5c1504 | l76-186-p-yz |        3306 | ONLINE       | YES        |
| 9d7be087-9f2f-11ee-8595-8c2a8e5c187f | l76-187-p-yz |        3306 | ONLINE       | NO         |
| 9e5b9e54-9f2f-11ee-af5c-8c2a8e5c1516 | l76-188-p-yz |        3306 | ONLINE       | NO         |
+--------------------------------------+--------------+-------------+--------------+------------+
My question: why did the cluster not fail over after the primary node went down?
yejr posted on 2025-9-15 11:13:
1. How long was the old primary node l76-186-p-yz down in total, from shutdown to restart?
2. Please also post your MGR-related parameter settings.
P.S. MGR in 5.7 is quite immature; you should migrate to 8.0+ as soon as possible.
pengzhaojing posted on 2025-9-15 11:25:
Shut down at 2025-09-10 17:45:07
Powered back on at 2025-09-10 17:58:22
my.cnf settings:
group_replication_exit_state_action='ABORT_SERVER'
group_replication_compression_threshold = 131072
group_replication_transaction_size_limit = 104857600
group_replication_unreachable_majority_timeout = 60
group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-202199991111"
group_replication_start_on_boot=off
group_replication_local_address= "10.26.76.187:24901"
group_replication_group_seeds="10.26.76.186:24901,10.26.76.187:24901,10.26.76.188:24901"
group_replication_bootstrap_group=off
group_replication_ip_whitelist="10.26.56.0/16"
group_replication_flow_control_mode='DISABLED'
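As a rough sanity check on the settings posted above, the following Python sketch uses only values from this thread to work out two things: whether the two surviving members still form a majority of the three-member group, and whether the seed addresses fall inside the configured whitelist range. The function names `has_majority` and `seeds_in_whitelist` are hypothetical helpers, not part of MySQL, and the sketch does not model what the group_replication plugin did internally.

```python
import ipaddress


def has_majority(reachable: int, group_size: int) -> bool:
    """Return True if `reachable` members are more than half of the group."""
    return reachable > group_size // 2


def seeds_in_whitelist(seeds: str, whitelist: str) -> dict:
    """Map each seed host (port stripped) to whether it lies in the whitelist network.

    strict=False is needed because "10.26.56.0/16" has host bits set;
    Python then treats it as the network 10.26.0.0/16.
    """
    network = ipaddress.ip_network(whitelist, strict=False)
    result = {}
    for seed in seeds.split(","):
        host = seed.strip().rsplit(":", 1)[0]
        result[host] = ipaddress.ip_address(host) in network
    return result


if __name__ == "__main__":
    # Values copied from the my.cnf excerpt in this thread.
    group_seeds = "10.26.76.186:24901,10.26.76.187:24901,10.26.76.188:24901"
    ip_whitelist = "10.26.56.0/16"

    # One of three members (the primary) was powered off; two remained.
    print("survivors have majority:", has_majority(reachable=2, group_size=3))
    print("seeds inside whitelist:", seeds_in_whitelist(group_seeds, ip_whitelist))
```

With two of three members still reachable, the survivors keep a majority, so `group_replication_unreachable_majority_timeout` (which concerns members that can no longer reach a majority) would not be expected to come into play in this incident.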
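yejr posted on 2025-9-15 12:03:
The secondary-node logs you posted start at 17:47:34.576495, but the primary was shut down at 17:45:07. Please also post the logs from the interval in between.
It is quite likely that the primary went down while it happened to be in some critical state, so the communication status with the secondary nodes could not be updated in time.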
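The interval in question runs from the 17:45:07 shutdown to the first posted group_replication message at 17:47:34.576495. Below is a minimal Python sketch for extracting that window from an error log. It assumes each line starts with an ISO 8601 timestamp, as in the posted excerpts, and that the shutdown time is in the same +08:00 zone as the log; `lines_in_window` is a hypothetical helper name.

```python
from datetime import datetime


def lines_in_window(lines, start, end):
    """Yield log lines whose leading timestamp falls within [start, end]."""
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            # Skip lines that do not begin with a parseable timestamp.
            continue
        if start <= ts <= end:
            yield line


# Two lines copied from the secondary-node log posted in this thread.
SAMPLE = [
    "2025-09-10T17:40:41.098205+08:00 70875209 Got an error reading communication packets",
    "2025-09-10T17:46:41.099769+08:00 70875676 Got an error reading communication packets",
]

# Assumption: the reported shutdown time uses the log's +08:00 offset.
START = datetime.fromisoformat("2025-09-10T17:45:07+08:00")
END = datetime.fromisoformat("2025-09-10T17:47:34.576495+08:00")

if __name__ == "__main__":
    for line in lines_in_window(SAMPLE, START, END):
        print(line)
```

In practice the `lines` argument could be an open file handle on the error log instead of the in-memory sample.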
pengzhaojing posted on 2025-9-15 14:16:
2025-09-10T17:40:41.098205+08:00 70875209 Got an error reading communication packets
2025-09-10T17:40:41.109856+08:00 70875210 Got an error reading communication packets
2025-09-10T17:41:41.098764+08:00 70875286 Got an error reading communication packets
2025-09-10T17:42:41.098787+08:00 70875365 Got an error reading communication packets
2025-09-10T17:43:23.873322+08:00 70875201 Aborted connection 70875201 to db: 'unconnected' user: 'inceptionadmin' host: '10.26.158.148' (Got an error reading communication packets)
2025-09-10T17:43:41.099224+08:00 70875442 Got an error reading communication packets
2025-09-10T17:44:41.099158+08:00 70875521 Got an error reading communication packets
2025-09-10T17:45:41.099461+08:00 70875600 Got an error reading communication packets
2025-09-10T17:45:41.114146+08:00 70875601 Got an error reading communication packets
2025-09-10T17:46:41.099769+08:00 70875676 Got an error reading communication packets
2025-09-10T17:47:34.576495+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:47:41.100019+08:00 70875754 Got an error reading communication packets
2025-09-10T17:47:46.923357+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:48:34.603089+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:48:41.100117+08:00 70875830 Got an error reading communication packets
2025-09-10T17:48:46.953037+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:49:34.528252+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:49:41.100580+08:00 70875906 Got an error reading communication packets
2025-09-10T17:49:46.978338+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:50:34.559733+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:50:41.100425+08:00 70875981 Got an error reading communication packets
2025-09-10T17:50:41.117296+08:00 70875982 Got an error reading communication packets
2025-09-10T17:50:47.007786+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:51:34.584621+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:51:41.101013+08:00 70876060 Got an error reading communication packets
2025-09-10T17:51:47.034747+08:00 0 Plugin group_replication reported: 'The member with address l76-187-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:52:34.609476+08:00 0 Plugin group_replication reported: 'The member with address l76-188-p-yz:3306 has already sent the stable set. Therefore discarding the second message.'
2025-09-10T17:52:41.100827+08:00 70876138 Got an error reading communication packets
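Reply to pengzhaojing's post of 2025-9-15 14:16:
After the old primary went down, the logs on the secondary nodes contain nothing related to it. It looks quite likely that you hit an edge case; there is not much to be done about that other than handling it manually. I also recommend, once again, upgrading to 8.0 or 8.4 as soon as possible; in particular, the MGR in GreatSQL 8.0 is more reliable and stable.
reddey posted on 2025-9-15 17:03, quoting yejr's post of 2025-9-15 12:03:
What exactly does this critical state look like in practice?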
yejr posted on 2025-9-15 17:04:
You would have to trace the code for the specific scenario to find out :)

I see.