无需改写SQL和加索引，GreatSQL Rapid引擎一招优化提升1300多倍

yejr 发表于 2024-4-16 15:13:20

原始案例：记一个优化SQL (qq.com)

在上述案例中，我在复现过程中，两个测试表均填充10万条记录，执行计划如下

*************************** 1. row ***************************
      id: 1
select_type: PRIMARY
   table: t
partitions: NULL
      type: ALL
possible_keys: NULL
      key: NULL
   key_len: NULL
      ref: NULL
      rows: 100010
filtered: 100.00
   Extra: NULL
*************************** 2. row ***************************
      id: 2
select_type: DEPENDENT SUBQUERY
   table: s
partitions: NULL
      type: ref
possible_keys: idx_de3_n1
      key: idx_de3_n1
   key_len: 19
      ref: sbtest.t.dept_no
      rows: 594
filtered: 11.11
   Extra: Using index condition; Using where

| -> Table scan on t(cost=9934.85 rows=98626) (actual time=0.028..30.044 rows=100000 loops=1)
-> Select #2 (subquery in projection; dependent)
-> Aggregate: count(s.to_date)(cost=160.68 rows=1) (actual time=1.159..1.159 rows=1 loops=100000)
   -> Filter: ((s.from_date >= '1900-01-03') and (s.from_date <= '1985-01-01'))(cost=154.13 rows=66) (actual time=1.159..1.159 rows=0 loops=100000)
         -> Index lookup on s using idx_de3_n1 (dept_no=t.dept_no), with index condition: (s.dept_no = t.dept_no)(cost=154.13 rows=590) (actual time=0.160..1.103 rows=676 loops=100000)

这条SQL在测试环境中耗时 113秒，详情见下：

# Query_time: 113.026347Lock_time: 0.000002 Rows_sent: 100000Rows_examined: 67747561 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935625 Read_first: 1 Read_last: 0 Read_key: 100001 Read_next: 67647561 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 100001 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T14:57:47.921731+08:00 End: 2024-04-16T14:59:40.948078+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# Full_scan: YesFull_join: NoTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
# InnoDB_IO_r_ops: 0InnoDB_IO_r_bytes: 0InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 873
SET timestamp=1713250455;
select *,(select count(to_date) from dept_emp5 s force index(idx_de3_n1) where s.dept_no = t.dept_no and from_date >='1900-01-03'and from_date <= '1985-01-01') s from t_group t;

由于t_group表需要全表扫描所以特别慢，可以看到总共需要扫描 67747561 行记录，但只返回10万条记录，所以特别慢。

这个案例我改成用Rapid引擎来跑，同样环境下，只需要耗时 0.08秒，性能提升1366倍，在这个过程中完全无需改写SQL、调整优化参数等额外动作，详情见下：

# Query_time: 0.082696Lock_time: 0.000002 Rows_sent: 100000Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935523 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:02:37.423297+08:00 End: 2024-04-16T15:02:37.505993+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# Full_scan: YesFull_join: YesTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0

另一个SQL优化案例参考：SQL优化案例解析：MINUS改写为标量子查询后提升5倍，但还可以再快近百倍 (qq.com)。

关于GreatSQL Rapid引擎详情参考：Rapid引擎（Rapid Engine） | GreatSQL用户手册。

yejr 发表于 2024-4-16 16:03:56

这是另一个优化案例：SQL 优化案例_(1) (qq.com)

在本案中，我填充了100万行测试数据，下面是对比结果

原始SQL运行了387秒还没结束，被我CTRL+C强制终止：

# Query_time: 387.640089Lock_time: 0.000002 Rows_sent: 0Rows_examined: 1755869365 Thread_id: 4355 Errno: 1317 Killed: 0 Bytes_received: 663 Bytes_sent: 266 Read_first: 1757 Read_last: 0 Read_key: 1757 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 1755871120 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 1 Start: 2024-04-16T15:39:48.025906+08:00 End: 2024-04-16T15:46:15.665995+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 1Tmp_disk_tables: 0Tmp_table_sizes: 85880
# InnoDB_trx_id: 0
# Full_scan: YesFull_join: NoTmp_table: YesTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
# InnoDB_IO_r_ops: 0InnoDB_IO_r_bytes: 0InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 7485
SET timestamp=1713253188;
select S.syctime_day,
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
from t1 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'
GROUP BY S.syctime_day order by S.syctime_day asc;

改用Rapid引擎（无需改写SQL、无需添加索引等额外工作）后，秒出结果，详情如下：

# Query_time: 0.022815Lock_time: 0.000003 Rows_sent: 1Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 662 Bytes_sent: 223 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:57:40.723431+08:00 End: 2024-04-16T15:57:40.746246+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# Full_scan: YesFull_join: YesTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
# No InnoDB statistics available for this query
SET timestamp=1713254260;
select S.syctime_day,
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
from t2 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'

效率提升不止数万倍。

以上。

yejr 发表于 2024-4-17 10:39:40

传统SQL优化案例：GreatSQL优化技巧：半连接(semijoin)优化，在这个案例中，如果采用GreatSQL Rapid引擎，则相比原始SQL提升120倍，相比优化后的SQL提升18倍

页: [1]

GreatSQL社区's Archiver

无需改写SQL和加索引，GreatSQL Rapid引擎一招优化提升1300多倍