edgar_mu posted on 2024-11-25 15:44:12

Failed to add an arbitrator node

Arbitrator node parameter:

(Mon Nov 25 15:39:03 2024)[(none)]>show global variables like '%arbi%';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| group_replication_arbitrator | ON    |
+------------------------------+-------+

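I ran c.add_instance('replication@xxxx:3306'); to add the node. The command reported success,
but the arbitrator node then failed to join. The error log says there is not enough disk space and that it needs as much space as the data set.
It looks as if the group_replication_arbitrator parameter has no effect.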


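KAiTO posted on 2024-11-25 16:32:43

First, you wrote "after c.add_instance('replication@xxxx:3306'); succeeded". Does that mean the command finished without reporting an error? If so, the arbitrator node was added successfully, so why do you then say "the arbitrator node failed to join"? Did the add succeed or fail?

If it succeeded, first run SELECT * FROM performance_schema.replication_group_members; to check the state of each node and whether the arbitrator node shows up as ARBITRATOR.

Please describe the problem a bit more clearly and include the error log, so it is easier to analyze.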

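yejr posted on 2024-11-25 18:00:04

Insufficient disk space will certainly be a problem. An arbitrator node does not store user data or logs, but GreatSQL itself still writes an error log, so it also needs some disk space.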

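edgar_mu posted on 2024-11-25 19:26:16

yejr posted on 2024-11-25 18:00
Insufficient disk space will certainly be a problem. An arbitrator node does not store user data or logs, but GreatSQL itself still writes an error log, so it also needs some ...

The data nodes already hold 1T of data. Does the arbitrator node really need 1T of disk from me? Is that by design?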

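edgar_mu posted on 2024-11-25 19:27:35

yejr posted on 2024-11-25 18:00
Insufficient disk space will certainly be a problem. An arbitrator node does not store user data or logs, but GreatSQL itself still writes an error log, so it also needs some ...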

2024-11-25T15:37:38.029970+08:00 25 Clone Apply set error code: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough.
That is the log. Also, it should not need to clone the data at all, right?
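
For anyone triaging a similar log, the two sizes in the error 3873 line quoted above can be pulled out and compared with a few lines of Python. This is only a minimal sketch and is not part of GreatSQL or MySQL Shell: the helper name `parse_clone_space_error` and the regular expression are assumptions written against this one log line, and other server versions may word the message differently.

```python
import re

# Binary unit multipliers for the sizes that appear in the Clone message.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Hypothetical pattern, written against the single log line quoted in this thread.
CLONE_SPACE_RE = re.compile(
    r"Clone estimated database size is ([\d.]+) (\w+)\. "
    r"Available space ([\d.]+) (\w+) is not enough"
)


def parse_clone_space_error(line):
    """Return (needed_bytes, available_bytes), or None if the line does not match."""
    m = CLONE_SPACE_RE.search(line)
    if m is None:
        return None
    needed = float(m.group(1)) * UNITS[m.group(2)]
    available = float(m.group(3)) * UNITS[m.group(4)]
    return needed, available


line = (
    "2024-11-25T15:37:38.029970+08:00 25 Clone Apply set error code: 3873: "
    "Clone estimated database size is 1.52 TiB. "
    "Available space 423.45 GiB is not enough."
)
needed, available = parse_clone_space_error(line)
shortfall_gib = (needed - available) / UNITS["GiB"]
print(f"Clone would need about {shortfall_gib:.2f} GiB more than is available")
```

The point of the comparison is that the space check is made by the Clone plugin against the donor's full data size, which is why the joining node is asked for disk on the scale of the data nodes.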

edgar_mu posted on 2024-11-25 19:31:08

Last edited by edgar_mu on 2024-11-25 19:33

The log is as follows:
Please select a recovery method [C]lone/[A]bort (default Abort): C
Validating instance configuration at 10.xx.22:3306...

This instance reports its own address as 10.xx.22:3306

Instance configuration is suitable.
NOTE: Group Replication will communicate with other members using '10.xx22:33061'. Use the localAddress option to override.

A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

NOTE: User 'mysql_innodb_cluster_330612022'@'%' already existed at instance '10.xx21:3306'. It will be deleted and created again with a new password.
Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
NOTE: Could not detect state recovery method for '10.xx22:3306'

WARNING: An unknown error occurred in state recovery of the instance.
The instance '10.xx22:3306' was successfully added to the cluster.

MySQL 10.xx21:33060+ ssl Py > c.status()
{
    "clusterName": "sj_test_12",
    "defaultReplicaSet": {
      "name": "default",
      "primary": "10.xx.21:3306",
      "ssl": "REQUIRED",
      "status": "OK_NO_TOLERANCE",
      "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.",
      "topology": {
            "10.xx.21:3306": {
                "address": "10.xx.21:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": null,
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.32"
            },
            "10.xx.22:3306": {
                "address": "10.xx.22:3306",
                "instanceErrors": [
                  "ERROR: group_replication has stopped with an error."
                ],
                "memberRole": "SECONDARY",
                "memberState": "ERROR",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "(MISSING)",
                "version": "8.0.32"
            },
            "10.xx.23:3306": {
                "address": "10.xx.23:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "83:03:51.513132",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.32"
            }
      },
      "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "10.xx21:3306"
}
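
When a cluster has more members than this, it can help to list only the ones that are not ONLINE. The following is a minimal sketch, assuming the status has been obtained as a plain Python dict (here by parsing a trimmed copy of the JSON above). The helper name `inactive_members` is hypothetical and is not a MySQL Shell API.

```python
import json


def inactive_members(status):
    """Return (address, status, instanceErrors) for members that are not ONLINE."""
    topology = status["defaultReplicaSet"]["topology"]
    problems = []
    for address, member in topology.items():
        if member.get("status") != "ONLINE":
            problems.append(
                (address, member.get("status"), member.get("instanceErrors", []))
            )
    return problems


# Trimmed copy of the c.status() output above, keeping only the fields used here.
status_json = """
{
  "defaultReplicaSet": {
    "status": "OK_NO_TOLERANCE",
    "topology": {
      "10.xx.21:3306": {"memberRole": "PRIMARY", "status": "ONLINE"},
      "10.xx.22:3306": {
        "instanceErrors": ["ERROR: group_replication has stopped with an error."],
        "memberRole": "SECONDARY",
        "status": "(MISSING)"
      },
      "10.xx.23:3306": {"memberRole": "SECONDARY", "status": "ONLINE"}
    }
  }
}
"""

status = json.loads(status_json)
for address, state, errors in inactive_members(status):
    print(address, state, errors)
```

For the output above this singles out 10.xx.22:3306, which is listed with memberRole SECONDARY and status (MISSING) even though add_instance reported that the instance was added successfully.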

2024-11-25T15:37:38.029640+08:00 25 Clone Set Error code: 3873 Saved Error code: 0
2024-11-25T15:37:38.029956+08:00 25 Clone Set Error code: 3873 Saved Error code: 3873
2024-11-25T15:37:38.029970+08:00 25 Clone Apply set error code: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough.
2024-11-25T15:37:38.029983+08:00 25 Plugin Clone reported: 'Client: Wait for remote after local issue: error: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough..'
2024-11-25T15:37:38.035478+08:00 25 Plugin Clone reported: 'Client: Command COM_EXECUTE: error: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough..'
2024-11-25T15:37:38.035710+08:00 25 Plugin Clone reported: 'Client: Master ACK COM_EXIT.'
2024-11-25T15:37:38.036228+08:00 25 Plugin Clone reported: 'Client: Master ACK Disconnect : abort: false.'
2024-11-25T15:37:38.038254+08:00 25 Plugin Clone reported: 'Client: Task COM_EXIT.'
2024-11-25T15:37:38.038791+08:00 25 Plugin Clone reported: 'Client: Task Disconnect : abort: false.'
2024-11-25T15:37:38.038818+08:00 25 Clone Set Error code: 3873 Saved Error code: 3873
2024-11-25T15:37:38.038831+08:00 25 Clone Apply set error code: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough.
2024-11-25T15:37:38.038846+08:00 25 Clone Set Error code: 3873 Saved Error code: 3873
2024-11-25T15:37:38.039054+08:00 25 Clone Apply End Master Task ID: 0 Failed, code: 3873: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough.
2024-11-25T15:37:38.039794+08:00 25 Plugin group_replication reported: 'Internal query: CLONE INSTANCE FROM 'mysql_innodb_cluster_330612022'@'10.xx21':3306 IDENTIFIED BY '*****' REQUIRE SSL; result in error. Error number: 3873'
2024-11-25T15:37:38.039857+08:00 24 Plugin group_replication reported: 'There was an issue when cloning from another server: Error number: 3873 Error message: Clone estimated database size is 1.52 TiB. Available space 423.45 GiB is not enough.'
2024-11-25T15:37:38.040028+08:00 24 Plugin group_replication reported: 'Setting super_read_only=ON.'
2024-11-25T15:37:38.040406+08:00 24 Plugin group_replication reported: 'Due to a critical cloning error or lack of donors, distributed recovery cannot be executed. The member will now leave the group.'
2024-11-25T15:37:38.040532+08:00 24 Plugin group_replication reported: 'Going to wait for view modification'
2024-11-25T15:37:38.040660+08:00 0 Plugin group_replication reported: ' xcom_client_remove_node: Try to push xcom_client_remove_node to XCom'
2024-11-25T15:37:38.043020+08:00 0 Plugin group_replication reported: ' new_site_def, new:0x7f7ce3a46000'
2024-11-25T15:37:38.043067+08:00 0 Plugin group_replication reported: ' clone_site_def, new:0x7f7ce3a46000,old site:0x7f7ce183dc00'
2024-11-25T15:37:38.043099+08:00 0 Plugin group_replication reported: ' remove_site_def n:1, site:0x7f7ce3a46000'
2024-11-25T15:37:38.043124+08:00 0 Plugin group_replication reported: ' handle_remove_node calls site_install_action, nodes:1, node number:2'
2024-11-25T15:37:38.043771+08:00 0 Plugin group_replication reported: ' update_servers is called, max nodes:2'
2024-11-25T15:37:38.043789+08:00 0 Plugin group_replication reported: ' Updating physical connections to other servers'
2024-11-25T15:37:38.043804+08:00 0 Plugin group_replication reported: ' Using existing server node 0 host 10.xx21:33061'
2024-11-25T15:37:38.043839+08:00 0 Plugin group_replication reported: ' Using existing server node 1 host 10.xx23:33061'
2024-11-25T15:37:38.043865+08:00 0 Plugin group_replication reported: ' Sucessfully installed new site definition. Start synode for this configuration is {db4f9753 1616104548 2}, boot key synode is {db4f9753 1616104537 2}, configured event horizon=10, my node identifier is 4294967295'
2024-11-25T15:37:38.043937+08:00 0 Plugin group_replication reported: ' free_site_def site:0x7f7ce183a000, x.msgno:1616101646, x.node:2'
2024-11-25T15:37:38.970874+08:00 13 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2024-11-25T15:37:41.050650+08:00 0 Plugin group_replication reported: ' 1732520261.050511 pid 929215 xcom_id 950e85b2 state xcom_fsm_run action x_fsm_terminate'
2024-11-25T15:37:41.050807+08:00 0 Plugin group_replication reported: ' set CON_NULL for fd:52 in close_connection'
2024-11-25T15:37:41.051015+08:00 0 Plugin group_replication reported: ' set CON_NULL for fd:53 in close_connection'
2024-11-25T15:37:41.051154+08:00 0 Plugin group_replication reported: ' 1732520261.051141 pid 929215 xcom_id 3c6befc0 state xcom_fsm_start action x_fsm_exit'
2024-11-25T15:37:41.051183+08:00 0 Plugin group_replication reported: ' Exiting xcom thread'
2024-11-25T15:37:41.051198+08:00 0 Plugin group_replication reported: ' terminate_and_exit calls here'
2024-11-25T15:37:41.051213+08:00 0 Plugin group_replication reported: ' cb_xcom_expel is called'
2024-11-25T15:37:41.051262+08:00 0 Plugin group_replication reported: ' set CON_NULL for fd:51 in close_connection'
2024-11-25T15:37:41.051413+08:00 0 Plugin group_replication reported: ' set CON_NULL for fd:50 in close_connection'
2024-11-25T15:37:41.860005+08:00 0 Plugin group_replication reported: ' Installing leave view.'
2024-11-25T15:37:41.860108+08:00 0 Plugin group_replication reported: ' ::install_view():: No exchanged data'
2024-11-25T15:37:41.860123+08:00 0 Plugin group_replication reported: 'on_view_changed is called'
2024-11-25T15:37:41.860214+08:00 0 Plugin group_replication reported: 'Group membership changed: This member has left the group.'

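yejr posted on 2024-11-25 20:08:14

edgar_mu posted on 2024-11-25 19:26
The data nodes already hold 1T of data. Does the arbitrator node really need 1T of disk from me? Is that by design?

As mentioned earlier, an arbitrator node does not need to store user data, nor does it need to store binlogs or write out relay logs. For details, see https://greatsql.cn/docs/8.0.32-26/5-enhance/5-2-ha-mgr-arbitrator.html

If you run into problems while trying out the arbitrator node, you can contact our assistant on WeChat and we can provide technical support online.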

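yejr posted on 2024-11-25 20:12:04

edgar_mu posted on 2024-11-25 19:27
2024-11-25T15:37:38.029970+08:00 25 Clone Apply set error code: 3873: Clone es ...

After setting `group_replication_arbitrator = 1` on the arbitrator node in advance, and with the Primary and Secondary nodes already running, you can run `START GROUP_REPLICATION` manually on the arbitrator node to join MGR directly, without going through Shell.

If you join through Shell, there is a step that checks transactions and runs Clone. You can skip it using the method described in this thread: https://greatsql.cn/thread-502-1-1.html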

edgar_mu posted on 2024-11-26 14:53:46

yejr posted on 2024-11-25 20:12
After setting `group_replication_arbitrator = 1` on the arbitrator node in advance, and with the Primary and Secondary nodes already ...

Got it, thanks. I will try it first.