GreatSQL社区

搜索

GreatSQL社区

主从复制中定位回放慢涉及的表

GreatSQL社区 已有 42 次阅读2025-2-17 11:06 |系统分类:运维实战

主从复制中定位回放慢涉及的表

一、前提

世界千奇百怪,每个人都有自己独立的思想,有些事情即使你附耳告知,也可能如风般吹过,进而消逝,为了性能为了不延迟,表要加索引嘛,然而在某业务场景,业务表数千张,无索引的表几百张,这些表都是上百万的数据。

二、现象

在 GreatSQL 主从架构中,某天在系统资源充足的情况下,主从突然延迟,而且持续增长,我们通过SHOW PROCESSLISTSHOW SLAVE STATUS 观察是由于回放 DELETE事务造成的,但是 GTID``在不断地增长,不过增长的非常缓慢,但是平时的时候是没有如此缓慢的,我们该如何快速的定位这些回放缓慢的 GTID 的涉及表呢,接下来就在测试环境演示下如何定位。

三、模拟测试

1、主节点创建库

greatsql> CREATE DATABASE fcmark;
Query OK, 1 row affected (0.03 sec)

2、准备数据

$ sysbench --db-driver=mysql --mysql-host=192.168.139.230 --mysql-port=3307 --mysql-user=greatsql --mysql-password=greatsql@2024 --mysql-db=fcmark --table_size=10000000 --tables=5 oltp_write_only  prepare
sysbench 1.1.0-df89d34 (using bundled LuaJIT 2.1.0-beta3)

Creating table 'sbtest1'...
Inserting 10000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
Creating table 'sbtest2'...
Inserting 10000000 records into 'sbtest2'
Creating a secondary index on 'sbtest2'...
Creating table 'sbtest3'...
Inserting 10000000 records into 'sbtest3'
Creating a secondary index on 'sbtest3'...
Creating table 'sbtest4'...
Inserting 10000000 records into 'sbtest4'
Creating a secondary index on 'sbtest4'...
Creating table 'sbtest5'...
Inserting 10000000 records into 'sbtest5'
Creating a secondary index on 'sbtest5'...

3、主节点修改表结构,将表修改为无索引表

greatsql> ALTER TABLE sbtest1 MODIFY id int not null;
Query OK, 10000000 rows affected (3 min 2.62 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest1 DROP key `k_1`;
Query OK, 0 rows affected (0.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest1 DROP PRIMARY KEY;
Query OK, 10000000 rows affected (1 min 50.19 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest2 MODIFY id int not null;
Query OK, 10000000 rows affected (3 min 2.62 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest2 DROP key `k_2`;
Query OK, 0 rows affected (0.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest2 DROP  PRIMARY KEY;
Query OK, 10000000 rows affected (1 min 50.19 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest3 MODIFY id int not null;
Query OK, 10000000 rows affected (3 min 2.72 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest3 DROP key `k_3`;
Query OK, 0 rows affected (0.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest3 DROP PRIMARY KEY;
Query OK, 10000000 rows affected (1 min 52.19 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest4 MODIFY id int not null;
Query OK, 10000000 rows affected (3 min 3.64 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest4 DROP key `k_4`;
Query OK, 0 rows affected (0.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest4 DROP PRIMARY KEY;
Query OK, 10000000 rows affected (1 min 53.19 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest5 MODIFY id int not null;
Query OK, 10000000 rows affected (3 min 5.62 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest5 DROP key `k_5`;
Query OK, 0 rows affected (0.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

greatsql> ALTER TABLE sbtest5 DROP PRIMARY KEY;
Query OK, 10000000 rows affected (1 min 51 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

4、主库操作数据模拟延迟

greatsql> DELETE FROM sbtest1 WHERE k<200;
Query OK, 214 rows affected (41.14 sec)

greatsql> DELETE FROM sbtest2 WHERE k<200;
Query OK, 193 rows affected (41.58 sec)

greatsql> DELETE FROM sbtest3 WHERE k<200;
Query OK, 177 rows affected (40.00 sec)

greatsql> DELETE FROM sbtest4 WHERE k<200;
Query OK, 184 rows affected (39.38 sec)

greatsql> DELETE FROM sbtest5 WHERE k<200;
Query OK, 212 rows affected (37.56 sec)

greatsql> DELETE FROM sbtest1 WHERE k<400;
Query OK, 182 rows affected (39.09 sec)

greatsql> DELETE FROM sbtest2 WHERE k<400;
Query OK, 193 rows affected (38.21 sec)

greatsql> DELETE FROM sbtest3 WHERE k<400; 
Query OK, 215 rows affected (34.45 sec)

greatsql> DELETE FROM sbtest4 WHERE k<400;
Query OK, 219 rows affected (37.45 sec)

greatsql> DELETE FROM sbtest5 WHERE k<400;
Query OK, 224 rows affected (34.63 sec)

greatsql> DELETE FROM sbtest1 WHERE k<600;
Query OK, 185 rows affected (34.11 sec)

greatsql> DELETE FROM sbtest2 WHERE k<600;
Query OK, 245 rows affected (35.99 sec)

greatsql> DELETE FROM sbtest3 WHERE k<600;
Query OK, 173 rows affected (36.08 sec)

greatsql> DELETE FROM sbtest4 WHERE k<600;
Query OK, 230 rows affected (36.18 sec)

greatsql> DELETE FROM sbtest5 WHERE k<600;
Query OK, 186 rows affected (38.00 sec)

等待执行完

5、观察从库:查看延迟及GTID

通过 SHOW PROCESSLIST 观察到回放数据的时候是Applying batch of row changes (delete)

通过SHOW SLAVE STATUS观察到:数据库一直在应用133171 GTID,主库的binlog位置对应的位点是binlog.000064714314051

注意:

  • Relay_Master_Log_File:SQL线程执行的最新事件对应在主库上的binlog日志文件名称
  • Exec_Master_Log_Pos:SQL县城已经读取和执行过程的中继日志,对应主库binlog日志文件的位点
  • Retrieved_Gtid_Set:从库已经接受到的GTID集合(I/O线程)
  • Executed_Gtid_Set:已经被写进binlog的GTID集合(SQL线程)
greatsql> SHOW PROCESSLIST;
+--------+-----------------+----------------------+--------+---------+---------+---------------------------------------------+------------------+------------+-----------+---------------+
| Id     | User            | Host                 | db     | Command | Time    | State                                       | Info             | Time_ms    | Rows_sent | Rows_examined |
+--------+-----------------+----------------------+--------+---------+---------+---------------------------------------------+------------------+------------+-----------+---------------+
|      5 | event_scheduler | localhost            | NULL   | Daemon  | 2356467 | Waiting on empty queue                      | NULL             | 2356467469 |         0 |             0 |
|     17 | system user     | connecting host      | NULL   | Connect | 2356460 | Waiting for source to send event            | NULL             | 2356459985 |         0 |             0 |
|     18 | system user     |                      | NULL   | Query   |      99 | Waiting for dependent transaction to commit | NULL             |      99081 |         0 |             0 |
|     19 | system user     |                      | fcmark | Query   |     697 | Applying batch of row changes (delete)      | NULL             |      32977 |         0 |             0 |
|     20 | system user     |                      | NULL   | Query   |    3244 | Waiting for an event from Coordinator       | NULL             |    3242579 |         0 |             0 |
|     21 | system user     |                      | NULL   | Connect | 2356460 | Waiting for an event from Coordinator       | NULL             | 2356459981 |         0 |             0 |
|     22 | system user     |                      | NULL   | Connect | 2356460 | Waiting for an event from Coordinator       | NULL             | 2356459980 |         0 |             0 |
| 540493 | greatsql         | 172.17.136.93:34298  | NULL   | Sleep   |      83 |                                             | NULL             |      83365 |         0 |             0 |
| 540593 | greatsql         | 172.17.139.230:51514 | NULL   | Query   |       0 | init                                        | show processlist |          0 |         0 |             0 |
| 540660 | greatsql         | 172.17.136.93:46212  | NULL   | Sleep   |     383 |                                             | NULL             |     383304 |         0 |             0 |
| 540730 | greatsql         | 172.17.136.93:51128  | NULL   | Sleep   |     143 |                                             | NULL             |     143319 |         0 |             0 |
| 540758 | greatsql         | 172.17.136.93:53112  | NULL   | Sleep   |      23 |                                             | NULL             |      23205 |         0 |             0 |
| 540763 | greatsql         | 172.17.139.230:52204 | NULL   | Sleep   |       1 |                                             | NULL             |       1246 |         0 |             0 |
+--------+-----------------+----------------------+--------+---------+---------+---------------------------------------------+------------------+------------+-----------+---------------+
13 rows in set (0.00 sec)

greatsql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 172.17.139.230
                  Master_User: greatsql
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: binlog.000064
          Read_Master_Log_Pos: 714784459
               Relay_Log_File: relaylog.000189
                Relay_Log_Pos: 714314261
        Relay_Master_Log_File: binlog.000064
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 714314051
              Relay_Log_Space: 714784956
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 700
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1000403307
                  Master_UUID: 1d0963e9-85d9-11ef-80e1-00163e28e06a
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for dependent transaction to commit
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-133182
            Executed_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-133170
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set, 1 warning (0.00 sec)

greatsql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 172.17.139.230
                  Master_User: greatsql
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: binlog.000064
          Read_Master_Log_Pos: 714784459
               Relay_Log_File: relaylog.000189
                Relay_Log_Pos: 714314261
        Relay_Master_Log_File: binlog.000064
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 714314051
              Relay_Log_Space: 714784956
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 704
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1000403307
                  Master_UUID: 1d0963e9-85d9-11ef-80e1-00163e28e06a
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for dependent transaction to commit
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-133182
            Executed_Gtid_Set: 1d0963e9-85d9-11ef-80e1-00163e28e06a:1-133170
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set, 1 warning (0.00 sec)

6、分析数据

主节点操作,根据从节点观察的信息,在主节点上观察 binlog 信息,回放内容是关于 fcmark.sbtest4``表的多个DELETE事件,观察该表结构是无索引,并且数据量接近 1千万

greatsql> SHOW  BINLOG EVENTS IN 'binlog.000064' FROM  714314051 LIMIT 50;
+---------------+-----------+-------------+------------+-------------+------------------------------------------------------------------------+
| Log_name      | Pos       | Event_type  | Server_id  | End_log_pos | Info                                                                   |
+---------------+-----------+-------------+------------+-------------+------------------------------------------------------------------------+
| binlog.000064 | 714314051 | Gtid        | 1000403307 |   714314130 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133171' |
| binlog.000064 | 714314130 | Query       | 1000403307 |   714314207 | BEGIN                                                                  |
| binlog.000064 | 714314207 | Table_map   | 1000403307 |   714314274 | table_id: 207 (fcmark.sbtest4)                                         |
| binlog.000064 | 714314274 | Delete_rows | 1000403307 |   714322479 | table_id: 207                                                          |
| binlog.000064 | 714322479 | Delete_rows | 1000403307 |   714330684 | table_id: 207                                                          |
| binlog.000064 | 714330684 | Delete_rows | 1000403307 |   714338889 | table_id: 207                                                          |
| binlog.000064 | 714338889 | Delete_rows | 1000403307 |   714347094 | table_id: 207                                                          |
| binlog.000064 | 714347094 | Delete_rows | 1000403307 |   714349409 | table_id: 207 flags: STMT_END_F                                        |
| binlog.000064 | 714349409 | Xid         | 1000403307 |   714349440 | COMMIT /* xid=6241839 */                                               |
| binlog.000064 | 714349440 | Gtid        | 1000403307 |   714349519 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133172' |
| binlog.000064 | 714349519 | Query       | 1000403307 |   714349596 | BEGIN                                                                  |
| binlog.000064 | 714349596 | Table_map   | 1000403307 |   714349663 | table_id: 210 (fcmark.sbtest5)                                         |
| binlog.000064 | 714349663 | Delete_rows | 1000403307 |   714357868 | table_id: 210                                                          |
| binlog.000064 | 714357868 | Delete_rows | 1000403307 |   714366073 | table_id: 210                                                          |
| binlog.000064 | 714366073 | Delete_rows | 1000403307 |   714374278 | table_id: 210                                                          |
| binlog.000064 | 714374278 | Delete_rows | 1000403307 |   714382483 | table_id: 210                                                          |
| binlog.000064 | 714382483 | Delete_rows | 1000403307 |   714390118 | table_id: 210 flags: STMT_END_F                                        |
| binlog.000064 | 714390118 | Xid         | 1000403307 |   714390149 | COMMIT /* xid=6241935 */                                               |
| binlog.000064 | 714390149 | Gtid        | 1000403307 |   714390228 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133173' |
| binlog.000064 | 714390228 | Query       | 1000403307 |   714390305 | BEGIN                                                                  |
| binlog.000064 | 714390305 | Table_map   | 1000403307 |   714390372 | table_id: 198 (fcmark.sbtest1)                                         |
| binlog.000064 | 714390372 | Delete_rows | 1000403307 |   714398577 | table_id: 198                                                          |
| binlog.000064 | 714398577 | Delete_rows | 1000403307 |   714406782 | table_id: 198                                                          |
| binlog.000064 | 714406782 | Delete_rows | 1000403307 |   714414987 | table_id: 198                                                          |
| binlog.000064 | 714414987 | Delete_rows | 1000403307 |   714423192 | table_id: 198                                                          |
| binlog.000064 | 714423192 | Delete_rows | 1000403307 |   714425127 | table_id: 198 flags: STMT_END_F                                        |
| binlog.000064 | 714425127 | Xid         | 1000403307 |   714425158 | COMMIT /* xid=6242038 */                                               |
| binlog.000064 | 714425158 | Gtid        | 1000403307 |   714425237 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133174' |
| binlog.000064 | 714425237 | Query       | 1000403307 |   714425314 | BEGIN                                                                  |
| binlog.000064 | 714425314 | Table_map   | 1000403307 |   714425381 | table_id: 201 (fcmark.sbtest2)                                         |
| binlog.000064 | 714425381 | Delete_rows | 1000403307 |   714433586 | table_id: 201                                                          |
| binlog.000064 | 714433586 | Delete_rows | 1000403307 |   714441791 | table_id: 201                                                          |
| binlog.000064 | 714441791 | Delete_rows | 1000403307 |   714449996 | table_id: 201                                                          |
| binlog.000064 | 714449996 | Delete_rows | 1000403307 |   714458201 | table_id: 201                                                          |
| binlog.000064 | 714458201 | Delete_rows | 1000403307 |   714462226 | table_id: 201 flags: STMT_END_F                                        |
| binlog.000064 | 714462226 | Xid         | 1000403307 |   714462257 | COMMIT /* xid=6242139 */                                               |
| binlog.000064 | 714462257 | Gtid        | 1000403307 |   714462336 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133175' |
| binlog.000064 | 714462336 | Query       | 1000403307 |   714462413 | BEGIN                                                                  |
| binlog.000064 | 714462413 | Table_map   | 1000403307 |   714462480 | table_id: 204 (fcmark.sbtest3)                                         |
| binlog.000064 | 714462480 | Delete_rows | 1000403307 |   714470685 | table_id: 204                                                          |
| binlog.000064 | 714470685 | Delete_rows | 1000403307 |   714478890 | table_id: 204                                                          |
| binlog.000064 | 714478890 | Delete_rows | 1000403307 |   714487095 | table_id: 204                                                          |
| binlog.000064 | 714487095 | Delete_rows | 1000403307 |   714495300 | table_id: 204                                                          |
| binlog.000064 | 714495300 | Delete_rows | 1000403307 |   714503505 | table_id: 204 flags: STMT_END_F                                        |
| binlog.000064 | 714503505 | Xid         | 1000403307 |   714503536 | COMMIT /* xid=6242232 */                                               |
| binlog.000064 | 714503536 | Gtid        | 1000403307 |   714503615 | SET @@SESSION.GTID_NEXT= '1d0963e9-85d9-11ef-80e1-00163e28e06a:133176' |
| binlog.000064 | 714503615 | Query       | 1000403307 |   714503692 | BEGIN                                                                  |
| binlog.000064 | 714503692 | Table_map   | 1000403307 |   714503759 | table_id: 207 (fcmark.sbtest4)                                         |
| binlog.000064 | 714503759 | Delete_rows | 1000403307 |   714511964 | table_id: 207                                                          |
| binlog.000064 | 714511964 | Delete_rows | 1000403307 |   714520169 | table_id: 207                                                          |
+---------------+-----------+-------------+------------+-------------+------------------------------------------------------------------------+
50 rows in set (0.00 sec)

greatsql> SHOW CREATE TABLE  fcmark.sbtest4\G
*************************** 1. row ***************************
       Table: sbtest4
Create Table: CREATE TABLE `sbtest4` (
  `id` int NOT NULL,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

greatsql>SELECT  count(1) FROM fcmark.sbtest4;
+----------+
| count(1) |
+----------+
|  9999367 |
+----------+
1 row in set (11.35 sec)

四、总结

使用SHOW PROCESSLISTSHOW SLAVE STATUS观察是数据库正在回放数据,找到正在回放的主库的binlog位点以及正在回放的GTID,在主库上使用SHOW BINLOG EVENTS IN binlog文件 FROM 位点 LIMIT 步长,来找到对应的表结构和事件类型,进而观察表结构,该表无索引且数据量大,所以导致从库回放的时候比较慢,至于处理办法,可以临时KILL 掉回放线程,然后加上索引,在重启回放线程,具体细节就不详细展开了。


评论 (0 个评论)

facelist

您需要登录后才可以评论 登录 | 立即注册

合作电话:010-64087828

社区邮箱:greatsql@greatdb.com

社区公众号
社区小助手
QQ群
GMT+8, 2025-2-21 23:49 , Processed in 0.015523 second(s), 9 queries , Redis On.
返回顶部