yejr 发表于 2024-4-16 15:13:20

无需改写SQL和加索引,GreatSQL Rapid引擎一招优化提升1300多倍

原始案例:记一个优化SQL (qq.com)


在上述案例中,我在复现过程中,两个测试表均填充10万条记录,执行计划如下


*************************** 1. row ***************************
         id: 1
select_type: PRIMARY
      table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 100010
   filtered: 100.00
      Extra: NULL
*************************** 2. row ***************************
         id: 2
select_type: DEPENDENT SUBQUERY
      table: s
   partitions: NULL
         type: ref
possible_keys: idx_de3_n1
          key: idx_de3_n1
      key_len: 19
          ref: sbtest.t.dept_no
         rows: 594
   filtered: 11.11
      Extra: Using index condition; Using where




| -> Table scan on t(cost=9934.85 rows=98626) (actual time=0.028..30.044 rows=100000 loops=1)
-> Select #2 (subquery in projection; dependent)
    -> Aggregate: count(s.to_date)(cost=160.68 rows=1) (actual time=1.159..1.159 rows=1 loops=100000)
      -> Filter: ((s.from_date >= '1900-01-03') and (s.from_date <= '1985-01-01'))(cost=154.13 rows=66) (actual time=1.159..1.159 rows=0 loops=100000)
            -> Index lookup on s using idx_de3_n1 (dept_no=t.dept_no), with index condition: (s.dept_no = t.dept_no)(cost=154.13 rows=590) (actual time=0.160..1.103 rows=676 loops=100000)



这条SQL在测试环境中耗时 113秒,详情见下:


# Query_time: 113.026347Lock_time: 0.000002 Rows_sent: 100000Rows_examined: 67747561 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935625 Read_first: 1 Read_last: 0 Read_key: 100001 Read_next: 67647561 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 100001 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T14:57:47.921731+08:00 End: 2024-04-16T14:59:40.948078+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# Full_scan: YesFull_join: NoTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
#   InnoDB_IO_r_ops: 0InnoDB_IO_r_bytes: 0InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 873
SET timestamp=1713250455;
select *,(select count(to_date) from dept_emp5 s force index(idx_de3_n1)   where s.dept_no = t.dept_no and from_date >='1900-01-03'and from_date <= '1985-01-01') s from t_group t;



由于t_group表需要全表扫描所以特别慢,可以看到总共需要扫描 67747561 行记录,但只返回10万条记录,所以特别慢。

这个案例我改成用Rapid引擎来跑,同样环境下,只需要耗时 0.08秒,性能提升1366倍,在这个过程中完全无需改写SQL、调整优化参数等额外动作,详情见下:


# Query_time: 0.082696Lock_time: 0.000002 Rows_sent: 100000Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935523 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:02:37.423297+08:00 End: 2024-04-16T15:02:37.505993+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# Full_scan: YesFull_join: YesTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0



另一个SQL优化案例参考:SQL优化案例解析:MINUS改写为标量子查询后提升5倍,但还可以再快近百倍 (qq.com)。

关于GreatSQL Rapid引擎详情参考:Rapid引擎(Rapid Engine) | GreatSQL用户手册。

yejr 发表于 2024-4-16 16:03:56

这是另一个优化案例:SQL 优化案例_(1) (qq.com)


在本案中,我填充了100万行测试数据,下面是对比结果

原始SQL运行了387秒还没结束,被我CTRL+C强制终止:


# Query_time: 387.640089Lock_time: 0.000002 Rows_sent: 0Rows_examined: 1755869365 Thread_id: 4355 Errno: 1317 Killed: 0 Bytes_received: 663 Bytes_sent: 266 Read_first: 1757 Read_last: 0 Read_key: 1757 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 1755871120 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 1 Start: 2024-04-16T15:39:48.025906+08:00 End: 2024-04-16T15:46:15.665995+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 1Tmp_disk_tables: 0Tmp_table_sizes: 85880
# InnoDB_trx_id: 0
# Full_scan: YesFull_join: NoTmp_table: YesTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
#   InnoDB_IO_r_ops: 0InnoDB_IO_r_bytes: 0InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 7485
SET timestamp=1713253188;
select S.syctime_day,
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
(select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
from t1 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'
GROUP BY S.syctime_day order by S.syctime_day asc;



改用Rapid引擎(无需改写SQL、无需添加索引等额外工作)后,秒出结果,详情如下:


# Query_time: 0.022815Lock_time: 0.000003 Rows_sent: 1Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 662 Bytes_sent: 223 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:57:40.723431+08:00 End: 2024-04-16T15:57:40.746246+08:00 Schema: sbtest Rows_affected: 0
# Tmp_tables: 0Tmp_disk_tables: 0Tmp_table_sizes: 0
# Full_scan: YesFull_join: YesTmp_table: NoTmp_table_on_disk: No
# Filesort: NoFilesort_on_disk: NoMerge_passes: 0
# No InnoDB statistics available for this query
SET timestamp=1713254260;
select S.syctime_day,
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
(select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
from t2 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'


效率提升不止数万倍。

以上。

yejr 发表于 2024-4-17 10:39:40

传统SQL优化案例:GreatSQL优化技巧:半连接(semijoin)优化,在这个案例中,如果采用GreatSQL Rapid引擎,则相比原始SQL提升120倍,相比优化后的SQL提升18倍
页: [1]
查看完整版本: 无需改写SQL和加索引,GreatSQL Rapid引擎一招优化提升1300多倍