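opencn posted on 2025-11-5 14:12:53

Memory usage of an MGR arbitrator node deployed with Docker

Last edited by opencn on 2025-11-5 14:16

[Environment]
GreatSQL version: 8.0.32
Ubuntu version: 24.0.4
Physical machines: 8 cores, 16 GB RAM

[Deployment]
The MGR cluster is deployed with Docker and has three nodes: two data nodes, one on each of machines A and B, and one arbitrator node that floats between A and B.
The Docker memory limit is 12 GB for each data node and 2 GB for the arbitrator node.

[Problem]
The arbitrator node stores no data, so why does its memory usage reach 1.8 GB? This pushes physical memory usage too high and causes the arbitrator node to restart.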

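yejr posted on 2025-11-5 15:13:04

Please first provide the arbitrator node's configuration file, along with the memory usage of the mysqld process on the arbitrator node (run ps -eo pid,size,rss,cmd|grep mysqld to get it).

opencn posted on 2025-11-5 15:27:02

yejr posted on 2025-11-5 15:13
Please first provide the arbitrator node's configuration file, along with the memory usage of the mysqld process on the arbitrator node (run ps -eo pid,size,rss,cmd|gre ...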

#
# my.cnf example for GreatSQL 8.0.32-27
#
# The parameter settings below are for reference only
#

[client]
socket    = /data/GreatSQL/db.sock


[mysql]
loose-skip-binary-as-hex
prompt = "(\\D)[\\u@GreatSQL][\\d]> "
no-auto-rehash


[mysqld]
user    = mysql
port    = 3306
server_id = 99
basedir = /usr/local/GreatSQL
datadir    = /data/GreatSQL
socket    = /data/GreatSQL/db.sock
pid-file = mysql.pid
character-set-server = UTF8MB4
skip_name_resolve = ON
default_time_zone = "+8:00"
bind_address = "0.0.0.0"
secure_file_priv = /data/GreatSQL
lower_case_table_names = 0
default_authentication_plugin=mysql_native_password

# Performance
lock_wait_timeout = 3600
open_files_limit    = 65535
back_log = 1024
max_connections = 512
max_connect_errors = 1000000
table_open_cache = 1024
table_definition_cache = 1024
sort_buffer_size = 4M
join_buffer_size = 4M
read_buffer_size = 8M
read_rnd_buffer_size = 4M
bulk_insert_buffer_size = 64M
thread_cache_size = 768
interactive_timeout = 600
wait_timeout = 600
tmp_table_size = 32M
max_heap_table_size = 32M
max_allowed_packet = 64M
net_buffer_shrink_interval = 180
sql_generate_invisible_primary_key = ON
loose-lock_ddl_polling_mode = ON
loose-lock_ddl_polling_runtime = 200

# Logs
log_timestamps = SYSTEM
# log_error = error.log
log_error_verbosity = 3
slow_query_log = ON
log_slow_extra = ON
slow_query_log_file = slow.log
long_query_time = 0.01
log_queries_not_using_indexes = ON
log_throttle_queries_not_using_indexes = 60
min_examined_row_limit = 100
log_slow_admin_statements = ON
log_slow_replica_statements = ON
log_slow_verbosity = FULL
log_bin = binlog
binlog_format = ROW
sync_binlog = 1
binlog_cache_size = 4M
max_binlog_cache_size = 6G
max_binlog_size = 1G
binlog_space_limit = 500G
binlog_rows_query_log_events = ON
binlog_expire_logs_seconds = 604800
binlog_checksum = CRC32
gtid_mode = ON
enforce_gtid_consistency = ON

# Replication
relay-log = relaylog
relay_log_recovery = ON
replica_parallel_type = LOGICAL_CLOCK
replica_parallel_workers = 4
binlog_transaction_dependency_tracking = WRITESET
replica_preserve_commit_order = ON
replica_checkpoint_period = 2
loose-rpl_read_binlog_speed_limit = 100

# Disable InnoDB PQ
loose-force_parallel_execute = OFF

# Parallel LOAD DATA
loose-gdb_parallel_load = ON
loose-innodb_optimize_no_pk_parallel_load = ON

# Rapid
#loose-plugin_load_add = 'ha_rapid.so'
loose-rapid_memory_limit = 128M
loose-rapid_worker_threads = 4
loose-rapid_hash_table_memory_limit = 10
loose-secondary_engine_parallel_load_workers = 4

# Clone
loose-plugin_load_add = 'mysql_clone.so'

# MGR
loose-plugin_load_add = 'group_replication.so'
loose-group_replication_group_name = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaa1'
loose-group_replication_view_change_uuid = 'AUTOMATIC'
loose-group_replication_local_address = arbi:33061
loose-group_replication_group_seeds = db1:33061,db2:33061,arbi:33061
loose-group_replication_communication_stack = "XCOM"
loose-group_replication_recovery_use_ssl = OFF
loose-group_replication_ssl_mode = DISABLED
# GreatSQL startup variable; do not modify
loose-group_replication_start_on_boot = START_MGR
# GreatSQL startup variable; do not modify
loose-group_replication_bootstrap_group = BOOTSTRAP_MGR
loose-group_replication_exit_state_action = 'READ_ONLY'
loose-group_replication_flow_control_mode = "DISABLED"
loose-group_replication_single_primary_mode = ON
loose-group_replication_enforce_update_everywhere_checks = OFF
loose-group_replication_majority_after_mode = ON
loose-group_replication_communication_max_message_size = 10M
loose-group_replication_arbitrator = ON
loose-group_replication_single_primary_fast_mode = 1
loose-group_replication_request_time_threshold = 100
loose-group_replication_primary_election_mode = GTID_FIRST
loose-group_replication_unreachable_majority_timeout = 3
loose-group_replication_member_expel_timeout = 3
loose-group_replication_autorejoin_tries = 288
loose-group_replication_recovery_get_public_key = ON
loose-group_replication_donor_threshold = 100
# Set the consistency level
loose-group_replication_consistency = 'BEFORE'

# greatdb_ha
#loose-plugin_load_add = 'greatdb_ha.so'
#loose-greatdb_ha_enable_mgr_vip = OFF
#loose-greatdb_ha_mgr_vip_nic = 'eth0'
#loose-greatdb_ha_mgr_vip_ip = '172.16.0.252'
#loose-greatdb_ha_mgr_vip_mask = '255.255.255.0'
#loose-greatdb_ha_port = 33062
#loose-greatdb_ha_mgr_read_vip_ips = "172.16.0.251,172.16.0.252"
#loose-greatdb_ha_mgr_read_vip_floating_type = "TO_ANOTHER_SECONDARY"
#loose-greatdb_ha_send_arp_packge_times = 5
#loose-greatdb_ha_mgr_exit_primary_kill_connection_mode = OFF
report_host = arbi
report_port = 3306

# InnoDB
innodb_buffer_pool_size = 128M
innodb_buffer_pool_instances = 1
innodb_data_file_path = ibdata1:12M:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 4M
innodb_redo_log_capacity = 32M
innodb_doublewrite_files = 2
innodb_max_undo_log_size = 128M
innodb_io_capacity = 4000
innodb_io_capacity_max = 8000
innodb_open_files = 65534
innodb_flush_method = O_DIRECT
innodb_lru_scan_depth = 4000
innodb_lock_wait_timeout = 10
innodb_rollback_on_timeout = ON
innodb_print_all_deadlocks = ON
innodb_online_alter_log_max_size = 128M
innodb_print_ddl_logs = ON
innodb_status_file = ON
innodb_status_output = OFF
innodb_status_output_locks = ON
innodb_sort_buffer_size = 8M
innodb_adaptive_hash_index = OFF
innodb_numa_interleave = OFF
innodb_spin_wait_delay = 20
innodb_print_lock_wait_timeout_info = ON
kill_idle_transaction = 300

ps -eo pid,size,rss,cmd|grep mysqld
      1 2472472 1948988 mysqld
    113     428    1920 grep mysqld
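
For readers who want to relate the ps figures to the 2 GB container limit, here is a minimal sketch. It assumes that ps reports the size and rss columns in KiB and that the limit is exactly 2 GiB; the function names are made up for illustration.

```python
# Sample row copied from the `ps -eo pid,size,rss,cmd` output above.
PS_LINE = "      1 2472472 1948988 mysqld"


def parse_ps_line(line):
    """Split one ps row into (pid, size_kib, rss_kib, cmd)."""
    pid, size_kib, rss_kib, cmd = line.split(None, 3)
    return int(pid), int(size_kib), int(rss_kib), cmd


def kib_to_gib(kib):
    """Convert KiB to GiB (1 GiB = 1024 * 1024 KiB)."""
    return kib / (1024 * 1024)


pid, size_kib, rss_kib, cmd = parse_ps_line(PS_LINE)
rss_gib = kib_to_gib(rss_kib)
limit_gib = 2.0  # assumption: the 2 GB Docker limit from the first post, read as 2 GiB

print(f"{cmd} (pid {pid}): RSS {rss_gib:.2f} GiB, "
      f"{rss_gib / limit_gib:.0%} of the {limit_gib:.0f} GiB limit")
```

Under those assumptions the resident set is already close to the limit, which is consistent with the restarts described in the first post.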

yejr posted on 2025-11-5 15:40:38

opencn posted on 2025-11-5 15:27
#
# my.cnf example for GreatSQL 8.0.32-27
#

Please also provide the following:
1. Run SELECT EVENT_NAME, SUM_NUMBER_OF_BYTES_ALLOC FROM
performance_schema.memory_summary_global_by_event_name
    ORDER BY SUM_NUMBER_OF_BYTES_ALLOC DESC LIMIT 10;

2. Run SELECT THREAD_ID, EVENT_NAME, SUM_NUMBER_OF_BYTES_ALLOC FROM
performance_schema.memory_summary_by_thread_by_event_name
    ORDER BY SUM_NUMBER_OF_BYTES_ALLOC DESC LIMIT 20;

3. Run SHOW PROCESSLIST;

opencn posted on 2025-11-5 15:55:45

yejr posted on 2025-11-5 15:40
Please also provide the following:
1. Run SELECT EVENT_NAME, SUM_NUMBER_OF_BYTES_ALLOC FROM
PERFORMANCE_SCHE ...
D:\1.png

yejr posted on 2025-11-5 15:56:08

opencn posted on 2025-11-5 15:27
#
# my.cnf example for GreatSQL 8.0.32-27
#

I tried this myself. An arbitrator node started with Docker uses about 1.3 GB of memory right after startup:
# ps -eo pid,size,rss,cmd | grep -i mysqld
      1 1828104 1304484 mysqld

Memory usage by module, queried from inside GreatSQL:
> SELECT THREAD_ID, EVENT_NAME, SUM_NUMBER_OF_BYTES_ALLOC FROM   performance_schema.memory_summary_by_thread_by_event_name
ORDER BY SUM_NUMBER_OF_BYTES_ALLOC DESC LIMIT 20;
+-----------+----------------------------------------------+---------------------------+
| THREAD_ID | EVENT_NAME                                   | SUM_NUMBER_OF_BYTES_ALLOC |
+-----------+----------------------------------------------+---------------------------+
|         1 | memory/innodb/memory                         |                   4250608 |
|         1 | memory/sql/dd::String_type                   |                   2061575 |
|        65 | memory/temptable/physical_ram                |                   1048608 |
|        46 | memory/innodb/memory                         |                   1010080 |
|        65 | memory/sql/THD::main_mem_root                |                    794704 |
|        45 | memory/sql/Tsid_map::Node                    |                    614088 |
|        58 | memory/sql/log_error::loaded_services        |                    600352 |
|        53 | memory/innodb/memory                         |                    553096 |
|         1 | memory/sql/THD::main_mem_root                |                    445600 |
|         1 | memory/sql/dd::objects                       |                    429144 |
|         1 | memory/sql/plugin_init_tmp                   |                    326992 |
|         1 | memory/sql/Prepared_statement::main_mem_root |                    298944 |
|         1 | memory/mysqld_openssl/openssl_malloc         |                    294381 |
|        65 | memory/innodb/memory                         |                    272904 |
|         1 | memory/sql/dd::infrastructure                |                    263552 |
|        53 | memory/sql/log_error::loaded_services        |                    222048 |
|        45 | memory/sql/dd::String_type                   |                    194641 |
|        45 | memory/sql/dd::objects                       |                    188160 |
|        45 | memory/innodb/memory                         |                    173368 |
|        54 | memory/mysys/MY_DIR                          |                    172760 |
+-----------+----------------------------------------------+---------------------------+

> SELECT EVENT_NAME, SUM_NUMBER_OF_BYTES_ALLOC FROM
    ->   performance_schema.memory_summary_global_by_event_name
    ->   ORDER BY SUM_NUMBER_OF_BYTES_ALLOC DESC LIMIT 10;
+-----------------------------------------------------------------------------+---------------------------+
| EVENT_NAME                                                                  | SUM_NUMBER_OF_BYTES_ALLOC |
+-----------------------------------------------------------------------------+---------------------------+
| memory/innodb/buf_buf_pool                                                  |                 137236480 |
| memory/performance_schema/events_statements_summary_by_digest               |                  42240000 |
| memory/sql/dd::String_type                                                  |                  34091801 |
| memory/innodb/log_buffer_memory                                             |                  33555440 |
| memory/innodb/ut0link_buf                                                   |                  25165888 |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |                  15765504 |
| memory/performance_schema/events_statements_history_long                    |                  15120000 |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |                  14598144 |
| memory/innodb/memory                                                        |                  13250696 |
| memory/performance_schema/events_statements_summary_by_digest.digest_text   |                  10240000 |
+-----------------------------------------------------------------------------+---------------------------+

> select * from memory_global_total;
+-----------------+
| total_allocated |
+-----------------+
| 466.80 MiB      |
+-----------------+

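This looks normal.

In other words, the memory allocation visible from inside GreatSQL is about 467 MB, while the mysqld process actually uses about 1.3 GB. The extra memory is mainly consumed by active connections, the InnoDB log buffer, performance_schema, temporary tables, and other modules.

If this memory really matters to you, consider disabling performance_schema and checking the result.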
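
To put a number on the gap described above, here is a rough sketch using only the two figures quoted in this reply. It assumes the ps rss column is in KiB; the function name is made up for illustration. The arithmetic only sizes the gap and says nothing about which component holds that memory.

```python
RSS_KIB = 1304484        # rss column from the ps output above (assumed to be KiB)
PFS_TOTAL_MIB = 466.80   # total_allocated reported by memory_global_total


def unaccounted_mib(rss_kib, instrumented_mib):
    """Resident memory (MiB) that the instrumented total does not explain."""
    return rss_kib / 1024 - instrumented_mib


rss_mib = RSS_KIB / 1024
gap_mib = unaccounted_mib(RSS_KIB, PFS_TOTAL_MIB)

print(f"RSS:          {rss_mib:8.1f} MiB")
print(f"Instrumented: {PFS_TOTAL_MIB:8.1f} MiB ({PFS_TOTAL_MIB / rss_mib:.0%} of RSS)")
print(f"Unaccounted:  {gap_mib:8.1f} MiB")
```

Repeating the same comparison after restarting with performance_schema disabled would show how much of the resident set that change actually frees.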

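yejr posted on 2025-11-5 16:03:11

yejr posted on 2025-11-5 15:56
I tried this myself. An arbitrator node started with Docker uses about 1.3 GB of memory right after startup:

To add to the above: I deployed an arbitrator node on an ordinary test machine and ran a general sysbench stress test. The arbitrator node's main advantage over the other nodes is lower CPU usage. Because it has to take part frequently in MGR state voting, its memory consumption is not especially low, and it still needs a fair amount of memory.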

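opencn posted on 2025-11-5 16:03:40

yejr posted on 2025-11-5 15:56
I tried this myself. An arbitrator node started with Docker uses about 1.3 GB of memory right after startup:

OK, thanks. I'll raise the memory limit then. My main worry was that memory would keep growing; as long as there is an upper bound, that is fine.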

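opencn posted on 2025-11-5 16:21:00

yejr posted on 2025-11-5 16:03
To add to the above: I deployed an arbitrator node on an ordinary test machine and ran a general sysbench stress test. The arbitrator node's main advantage over the other nodes ...

OK, we'll try to work out a threshold through testing.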
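
One possible way to collect data for such a threshold is to poll the resident set size of mysqld while a load test runs and keep the highest value seen. The sketch below is only an illustration: it assumes a Linux container where mysqld is PID 1 (as in the ps output earlier in the thread), and the function names, sample count, and interval are hypothetical.

```python
import re
import time


def read_status_kib(status_text, field):
    """Return the value of a `Field:   N kB` line from /proc/<pid>/status, or None."""
    match = re.search(rf"^{re.escape(field)}:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_peak_rss(pid=1, samples=3, interval_s=1.0):
    """Poll VmRSS for `pid` and return the highest value seen, in KiB."""
    peak = 0
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as fh:
            rss = read_status_kib(fh.read(), "VmRSS")
        if rss is not None:
            peak = max(peak, rss)
        time.sleep(interval_s)
    return peak


# Illustrative status text; the VmRSS value reuses the rss figure quoted in the thread.
SAMPLE = "Name:\tmysqld\nVmRSS:\t 1304484 kB\n"
print(read_status_kib(SAMPLE, "VmRSS"))

# Inside the container one might call, for example:
#   sample_peak_rss(pid=1, samples=3600, interval_s=1.0)
```

The kernel also exposes a VmHWM field in the same file, which records the peak resident set size since the process started and can serve as a cross-check.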

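yejr posted on 2025-11-5 16:21:30

opencn posted on 2025-11-5 16:21
OK, we'll try to work out a threshold through testing.

Great, thanks for the effort. Feel free to share your test results afterwards.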