GreatSQL社区

搜索

[已解决] 无需改写SQL和加索引,GreatSQL Rapid引擎一招优化提升1300多倍

819 2 2024-4-16 15:13
原始案例:记一个优化SQL (qq.com)


在上述案例中,我在复现过程中,两个测试表均填充10万条记录,执行计划如下


  1. *************************** 1. row ***************************
  2.            id: 1
  3.   select_type: PRIMARY
  4.         table: t
  5.    partitions: NULL
  6.          type: ALL
  7. possible_keys: NULL
  8.           key: NULL
  9.       key_len: NULL
  10.           ref: NULL
  11.          rows: 100010
  12.      filtered: 100.00
  13.         Extra: NULL
  14. *************************** 2. row ***************************
  15.            id: 2
  16.   select_type: DEPENDENT SUBQUERY
  17.         table: s
  18.    partitions: NULL
  19.          type: ref
  20. possible_keys: idx_de3_n1
  21.           key: idx_de3_n1
  22.       key_len: 19
  23.           ref: sbtest.t.dept_no
  24.          rows: 594
  25.      filtered: 11.11
  26.         Extra: Using index condition; Using where




  27. | -> Table scan on t  (cost=9934.85 rows=98626) (actual time=0.028..30.044 rows=100000 loops=1)
  28. -> Select #2 (subquery in projection; dependent)
  29.     -> Aggregate: count(s.to_date)  (cost=160.68 rows=1) (actual time=1.159..1.159 rows=1 loops=100000)
  30.         -> Filter: ((s.from_date >= '1900-01-03') and (s.from_date <= '1985-01-01'))  (cost=154.13 rows=66) (actual time=1.159..1.159 rows=0 loops=100000)
  31.             -> Index lookup on s using idx_de3_n1 (dept_no=t.dept_no), with index condition: (s.dept_no = t.dept_no)  (cost=154.13 rows=590) (actual time=0.160..1.103 rows=676 loops=100000)

复制代码


这条SQL在测试环境中耗时 113秒,详情见下:


  1. # Query_time: 113.026347  Lock_time: 0.000002 Rows_sent: 100000  Rows_examined: 67747561 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935625 Read_first: 1 Read_last: 0 Read_key: 100001 Read_next: 67647561 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 100001 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T14:57:47.921731+08:00 End: 2024-04-16T14:59:40.948078+08:00 Schema: sbtest Rows_affected: 0
  2. # Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
  3. # InnoDB_trx_id: 0
  4. # Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
  5. # Filesort: No  Filesort_on_disk: No  Merge_passes: 0
  6. #   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
  7. #   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
  8. #   InnoDB_pages_distinct: 873
  9. SET timestamp=1713250455;
  10. select *  ,(select count(to_date) from dept_emp5 s force index(idx_de3_n1)   where s.dept_no = t.dept_no and from_date >='1900-01-03'  and from_date <= '1985-01-01') s from t_group t;

复制代码


由于t_group表需要全表扫描所以特别慢,可以看到总共需要扫描 67747561 行记录,但只返回10万条记录,所以特别慢。

这个案例我改成用Rapid引擎来跑,同样环境下,只需要耗时 0.08秒,性能提升1366倍,在这个过程中完全无需改写SQL、调整优化参数等额外动作,详情见下:


  1. # Query_time: 0.082696  Lock_time: 0.000002 Rows_sent: 100000  Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 189 Bytes_sent: 4935523 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:02:37.423297+08:00 End: 2024-04-16T15:02:37.505993+08:00 Schema: sbtest Rows_affected: 0
  2. # Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
  3. # Full_scan: Yes  Full_join: Yes  Tmp_table: No  Tmp_table_on_disk: No
  4. # Filesort: No  Filesort_on_disk: No  Merge_passes: 0

复制代码


另一个SQL优化案例参考:SQL优化案例解析:MINUS改写为标量子查询后提升5倍,但还可以再快近百倍 (qq.com)

关于GreatSQL Rapid引擎详情参考:Rapid引擎(Rapid Engine) | GreatSQL用户手册
全部回复(2)
yejr 2024-4-16 16:03:56
这是另一个优化案例:SQL 优化案例_(1) (qq.com)


在本案中,我填充了100万行测试数据,下面是对比结果

原始SQL运行了387秒还没结束,被我CTRL+C强制终止:


  1. # Query_time: 387.640089  Lock_time: 0.000002 Rows_sent: 0  Rows_examined: 1755869365 Thread_id: 4355 Errno: 1317 Killed: 0 Bytes_received: 663 Bytes_sent: 266 Read_first: 1757 Read_last: 0 Read_key: 1757 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 1755871120 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 1 Start: 2024-04-16T15:39:48.025906+08:00 End: 2024-04-16T15:46:15.665995+08:00 Schema: sbtest Rows_affected: 0
  2. # Tmp_tables: 1  Tmp_disk_tables: 0  Tmp_table_sizes: 85880
  3. # InnoDB_trx_id: 0
  4. # Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No
  5. # Filesort: No  Filesort_on_disk: No  Merge_passes: 0
  6. #   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
  7. #   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
  8. #   InnoDB_pages_distinct: 7485
  9. SET timestamp=1713253188;
  10. select S.syctime_day,
  11. (select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
  12. (select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
  13. (select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
  14. (select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
  15. (select count(*) from t1 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
  16. from t1 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'
  17. GROUP BY S.syctime_day order by S.syctime_day asc;

复制代码


改用Rapid引擎(无需改写SQL、无需添加索引等额外工作)后,秒出结果,详情如下:


  1. # Query_time: 0.022815  Lock_time: 0.000003 Rows_sent: 1  Rows_examined: 0 Thread_id: 4355 Errno: 0 Killed: 0 Bytes_received: 662 Bytes_sent: 223 Read_first: 0 Read_last: 0 Read_key: 0 Read_next: 0 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 0 Sort_scan_count: 0 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2024-04-16T15:57:40.723431+08:00 End: 2024-04-16T15:57:40.746246+08:00 Schema: sbtest Rows_affected: 0
  2. # Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
  3. # Full_scan: Yes  Full_join: Yes  Tmp_table: No  Tmp_table_on_disk: No
  4. # Filesort: No  Filesort_on_disk: No  Merge_passes: 0
  5. # No InnoDB statistics available for this query
  6. SET timestamp=1713254260;
  7. select S.syctime_day,
  8. (select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CDE',
  9. (select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SDE',
  10. (select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'PDE',
  11. (select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'CSE',
  12. (select count(*) from t2 SS where SS.syctime_day = S.syctime_day and SS.o_source = 'CDE') as 'SSE'
  13. from t2 S where S.syctime_day > '2024-01-01' and S.syctime_day < '2024-08-01'

复制代码

效率提升不止数万倍。

以上。

yejr 2024-4-17 10:39:40
传统SQL优化案例:GreatSQL优化技巧:半连接(semijoin)优化,在这个案例中,如果采用GreatSQL Rapid引擎,则相比原始SQL提升120倍,相比优化后的SQL提升18倍
yejr

38

主题

1

博客

1324

贡献

管理员

Rank: 9Rank: 9Rank: 9

金币
4992
贡献
1324

合作电话:010-64087828

社区邮箱:greatsql@greatdb.com

社区公众号
社区小助手
QQ群
GMT+8, 2025-1-18 16:54 , Processed in 0.022241 second(s), 17 queries , Redis On.
快速回复 返回顶部 返回列表