PostgreSQL SQL优化针对窗口函数的一个SQL优化案例

1 本章背景知识

窗口函数常常用于分组排序运算中，方便用户实现各种分组需求。
由于窗口函数通常是全表扫描数据，同时还需排序聚集，消耗大量的CPU资源，执行效率较低。

本文主要介绍一例窗口函数的优化案例。

2 环境准备

系统介绍
1、信息系统中存在资讯信息这样一个模块，用于发布一些和业务相关的活动动态。
2、每条资讯信息都有一个所属类型（如科技类的资讯、娱乐类、军事类···）和浏览量字段。
3、官网上需要滚动展示一些热门资讯信息列表（浏览量越大代表越热门），而且每个类别的相关资讯记录至多显示3条，
4、“按照资讯分类分组，取每组的前3条资讯信息列表”。

2.1 创建资讯表

字段	说明
id	主键
title	资讯标题
viewnum	浏览量
info_type_id	资讯类别
code	资讯信息

 CREATE TABLE info(
    id numeric not null primary key ,
    title varchar(100),
    viewnum numeric,
    info_type_id numeric,
    code text 
 );

2.2 创建info_type_id索引

 CREATE INDEX info_infotypeid ON info (info_type_id);

2.3 创建资讯类别表

 CREATE TABLE info_type(
    id numeric  not null PRIMARY KEY,
    name varchar(100) 
 );

2.4 插入100个新闻分类

 
INSERT INTO info_type 
SELECT id, 'TYPE' || lpadtext, 5, '0' 
FROM 
generate_series(1, 100) id;

2.5 插入1000000个新闻

INSERT INTO INFO 
SELECT id, 'TTL' || lpadtext, 20, '0' ) title, ceil(random()*1000000) viewnum, ceil(random()*100) info_type_id , md5(id code 
FROM generate_series(1, 1000000) id;

2.6 收集统计信息

VACUUM ANALYSE info_type,info;

3 使用窗口函数下的性能问题

EXPLAIN (ANALYSE ,BUFFERS )
WITH i AS (SELECT i.*, row_number() over (partition by i.info_type_id order by i.viewnum desc) sn FROM info i)
SELECT * 
FROM info_type t LEFT JOIN i ON i.sn <= 3 
AND i.info_type_id = t.id;
 //屏幕输出：
 QUERY PLAN                                                                  
 -----------------------------------------------------------------------
  Hash Right Join  (cost=211867.09..245279.17 rows=333333 width=97) (actual time=4223.126..6169.377 rows=300 loops=1)
    Hash Cond: (i.info_type_id = t.id)
    Buffers: shared hit=11582 read=1753, temp read=17860 written=17901
    -  Subquery Scan on i  (cost=211863.84..244363.84 rows=333333 width=82) (actual time=4223.080..6168.742 rows=300 loops=1)
          Filter: (i.sn <= 3)
          Rows Removed by Filter: 999700
          Buffers: shared hit=11582 read=1752, temp read=17860 written=17901
          -  WindowAgg  (cost=211863.84..231863.84 rows=1000000 width=82) (actual time=4223.079..6080.518 rows=1000000 loops=1)
                Buffers: shared hit=11582 read=1752, temp read=17860 written=17901
                -  Sort  (cost=211863.84..214363.84 rows=1000000 width=74) (actual time=4223.065..5224.438 rows=1000000 loops=1)
                      Sort Key: i_1.info_type_id, i_1.viewnum DESC
                      Sort Method: external merge  Disk: 84128kB
                      Buffers: shared hit=11582 read=1752, temp read=17860 written=17901
                      -  Seq Scan on info i_1  (cost=0.00..23334.00 rows=1000000 width=74) (actual time=0.006..249.981 rows=1000000 loops=1)
                            Buffers: shared hit=11582 read=1752
    -  Hash  (cost=2.00..2.00 rows=100 width=15) (actual time=0.037..0.037 rows=100 loops=1)
          Buckets: 1024  Batches: 1  Memory Usage: 13kB
          Buffers: shared read=1
          -  Seq Scan on info_type t  (cost=0.00..2.00 rows=100 width=15) (actual time=0.015..0.021 rows=100 loops=1)
                Buffers: shared read=1
  Planning Time: 0.328 ms
  Execution Time: 6182.496 ms
 (22 rows)
 登录后复制

可以看到，这里消耗资源最大的是在 sort 操作上。那么，我们能否避免sort 操作了？使用索引可以避免sort 操作。

4 优化方案一：只取第3名的记录

方法一由于读取了大量数据块，耗时过多。本方法暂时先简化例子，功能要求只需返回每组1条记录。

新的SQL特点，每个类型使用子查询通过info表的info_type_id列的索引，可以避免读取多余的数据。select list的子查询作为计算列，只能返回一个值，所以使用row info 先整合，然后使用 (inf.* 再分解，同时使用 offset 2 limit 1获取第三名的一行记录。

explain (analyse ,buffers )
select id, name, (inf).*
from (select t.*,
             (select row (i.*)::info
              from info i
              where i.info_type_id = t.id
              order by i.viewnum desc
                  offset 2
              limit 1) inf
      from info_type t
     ) t;

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on info_type t  (cost=0.00..6708942.94 rows=100 width=361) (actual time=127.552..10513.868 rows=100 loops=1)
   Buffers: shared hit=3544406 read=3255
   SubPlan 1
     -  Limit  (cost=13417.88..13417.88 rows=1 width=38) (actual time=21.744..21.745 rows=1 loops=100)
           Buffers: shared hit=706280 read=3252
           -  Sort  (cost=13417.87..13442.87 rows=10000 width=38) (actual time=21.740..21.740 rows=3 loops=100)
                 Sort Key: i.viewnum DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 Buffers: shared hit=706280 read=3252
                 -  Bitmap Heap Scan on info i  (cost=185.93..13288.63 rows=10000 width=38) (actual time=3.985..18.371 rows=10000 loops=100)
                       Recheck Cond: (info_type_id = t.id)
                       Heap Blocks: exact=706728
                       Buffers: shared hit=706280 read=3252
                       -  Bitmap Index Scan on info_infotypeid  (cost=0.00..183.43 rows=10000 width=0) (actual time=2.615..2.615 rows=10000 loops=100)
                             Index Cond: (info_type_id = t.id)
                             Buffers: shared hit=1272 read=1532
   SubPlan 2
     -  Limit  (cost=13417.88..13417.88 rows=1 width=38) (actual time=20.599..20.600 rows=1 loops=100)
           Buffers: shared hit=709529 read=3
           -  Sort  (cost=13417.87..13442.87 rows=10000 width=38) (actual time=20.595..20.595 rows=3 loops=100)
                 Sort Key: i_1.viewnum DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 Buffers: shared hit=709529 read=3
                 -  Bitmap Heap Scan on info i_1  (cost=185.93..13288.63 rows=10000 width=38) (actual time=3.640..17.373 rows=10000 loops=100)
                       Recheck Cond: (info_type_id = t.id)
                       Heap Blocks: exact=706728
                       Buffers: shared hit=709529 read=3
                       -  Bitmap Index Scan on info_infotypeid  (cost=0.00..183.43 rows=10000 width=0) (actual time=2.291..2.291 rows=10000 loops=100)
                             Index Cond: (info_type_id = t.id)
                             Buffers: shared hit=2801 read=3
   SubPlan 3
     -  Limit  (cost=13417.88..13417.88 rows=1 width=38) (actual time=21.284..21.285 rows=1 loops=100)
           Buffers: shared hit=709532
           -  Sort  (cost=13417.87..13442.87 rows=10000 width=38) (actual time=21.279..21.279 rows=3 loops=100)
                 Sort Key: i_2.viewnum DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 Buffers: shared hit=709532
                 -  Bitmap Heap Scan on info i_2  (cost=185.93..13288.63 rows=10000 width=38) (actual time=3.609..17.868 rows=10000 loops=100)
                       Recheck Cond: (info_type_id = t.id)
                       Heap Blocks: exact=706728
                       Buffers: shared hit=709532
                       -  Bitmap Index Scan on info_infotypeid  (cost=0.00..183.43 rows=10000 width=0) (actual time=2.267..2.267 rows=10000 loops=100)
                             Index Cond: (info_type_id = t.id)
                             Buffers: shared hit=2804
   SubPlan 4
     -  Limit  (cost=13417.88..13417.88 rows=1 width=38) (actual time=20.763..20.763 rows=1 loops=100)
           Buffers: shared hit=709532
           -  Sort  (cost=13417.87..13442.87 rows=10000 width=38) (actual time=20.759..20.759 rows=3 loops=100)
                 Sort Key: i_3.viewnum DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 Buffers: shared hit=709532
                 -  Bitmap Heap Scan on info i_3  (cost=185.93..13288.63 rows=10000 width=38) (actual time=3.769..17.505 rows=10000 loops=100)
                       Recheck Cond: (info_type_id = t.id)
                       Heap Blocks: exact=706728
                       Buffers: shared hit=709532
                       -  Bitmap Index Scan on info_infotypeid  (cost=0.00..183.43 rows=10000 width=0) (actual time=2.390..2.390 rows=10000 loops=100)
                             Index Cond: (info_type_id = t.id)
                             Buffers: shared hit=2804
   SubPlan 5
     -  Limit  (cost=13417.88..13417.88 rows=1 width=38) (actual time=20.713..20.713 rows=1 loops=100)
           Buffers: shared hit=709532
           -  Sort  (cost=13417.87..13442.87 rows=10000 width=38) (actual time=20.709..20.709 rows=3 loops=100)
                 Sort Key: i_4.viewnum DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 Buffers: shared hit=709532
                 -  Bitmap Heap Scan on info i_4  (cost=185.93..13288.63 rows=10000 width=38) (actual time=3.689..17.432 rows=10000 loops=100)
                       Recheck Cond: (info_type_id = t.id)
                       Heap Blocks: exact=706728
                       Buffers: shared hit=709532
                       -  Bitmap Index Scan on info_infotypeid  (cost=0.00..183.43 rows=10000 width=0) (actual time=2.288..2.288 rows=10000 loops=100)
                             Index Cond: (info_type_id = t.id)
                             Buffers: shared hit=2804
 Planning Time: 0.729 ms
 Execution Time: 10514.326 ms
(74 rows)
登录后复制

方法二针对 info_type 的每一行，info 表都要根据 info_type_id 索引访问 info 表 5 次（5个列）。总时间消耗： 100 （行）*5（列）* 20 （每次大概20ms），大约 10000ms。

执行计划分析：根据 info_type_id 索引，需要访问的行数太多，而且还是需要排序。基于这些考虑，我们可以创建个 info_type_id + viewnum 复合索引，减少每访问的时间消耗，避免排序。

5 优化方法二：优化索引

create index info_typeview on info(info_type_id,viewnum);

                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on info_type t  (cost=0.00..4627.72 rows=100 width=361) (actual time=0.255..13.391 rows=100 loops=1)
   Buffers: shared hit=2881 read=120
   SubPlan 1
     -  Limit  (cost=6.31..9.25 rows=1 width=38) (actual time=0.041..0.041 rows=1 loops=100)
           Buffers: shared hit=480 read=120
           -  Index Scan Backward using info_typeview on info i  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.034..0.040 rows=3 loops=100)
                 Index Cond: (info_type_id = t.id)
                 Buffers: shared hit=480 read=120
   SubPlan 2
     -  Limit  (cost=6.31..9.25 rows=1 width=38) (actual time=0.022..0.022 rows=1 loops=100)
           Buffers: shared hit=600
           -  Index Scan Backward using info_typeview on info i_1  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.018..0.021 rows=3 loops=100)
                 Index Cond: (info_type_id = t.id)
                 Buffers: shared hit=600
   SubPlan 3
     -  Limit  (cost=6.31..9.25 rows=1 width=38) (actual time=0.021..0.021 rows=1 loops=100)
           Buffers: shared hit=600
           -  Index Scan Backward using info_typeview on info i_2  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.018..0.020 rows=3 loops=100)
                 Index Cond: (info_type_id = t.id)
                 Buffers: shared hit=600
   SubPlan 4
     -  Limit  (cost=6.31..9.25 rows=1 width=38) (actual time=0.021..0.021 rows=1 loops=100)
           Buffers: shared hit=600
           -  Index Scan Backward using info_typeview on info i_3  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.018..0.020 rows=3 loops=100)
                 Index Cond: (info_type_id = t.id)
                 Buffers: shared hit=600
   SubPlan 5
     -  Limit  (cost=6.31..9.25 rows=1 width=38) (actual time=0.023..0.023 rows=1 loops=100)
           Buffers: shared hit=600
           -  Index Scan Backward using info_typeview on info i_4  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.020..0.022 rows=3 loops=100)
                 Index Cond: (info_type_id = t.id)
                 Buffers: shared hit=600
 Planning Time: 0.730 ms
 Execution Time: 13.552 ms
(34 rows)登录后复制

可以看到，创建新索引后，单次的访问从 20ms 降低到 0.023ms ，将近降了 1000 倍。

存在问题：限制了返回行数，仅一行，同时info表有5个列，所以有5个subplan，其中4个是冗余的。

6 优化方法三：使用array，一次返回多行

以下再修改新的SQL，新的SQL特点，select list的子查询作为计算列，只能返回一行值，所以使用array() 先转换成数组类型，然后使用 unnest() 再分解成多行，同时使用 limit 3获取前三名的三行记录。

explain (analyse ,buffers )
select id, name, (inf).*
from (select t.id, t.name, unnest(inf) inf
      from (select t.*,
                   array(select row (i.*)::info
                         from info i
                         where i.info_type_id = t.id
                         order by i.viewnum desc
                         limit 3) inf
            from info_type t
           ) t) t;

                                                                          QUERY PLAN                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on t  (cost=0.00..942.89 rows=1000 width=361) (actual time=0.092..2.526 rows=300 loops=1)
   Buffers: shared hit=601
   -  ProjectSet  (cost=0.00..932.89 rows=1000 width=47) (actual time=0.089..2.406 rows=300 loops=1)
         Buffers: shared hit=601
         -  Seq Scan on info_type t_1  (cost=0.00..2.00 rows=100 width=15) (actual time=0.008..0.020 rows=100 loops=1)
               Buffers: shared hit=1
         SubPlan 1
           -  Limit  (cost=0.42..9.25 rows=3 width=38) (actual time=0.018..0.021 rows=3 loops=100)
                 Buffers: shared hit=600
                 -  Index Scan Backward using info_typeview on info i  (cost=0.42..29421.91 rows=10000 width=38) (actual time=0.017..0.020 rows=3 loops=100)
                       Index Cond: (info_type_id = t_1.id)
                       Buffers: shared hit=600
 Planning Time: 0.295 ms
 Execution Time: 2.639 ms
(14 rows)

7 优化方法四：使用lateral

新的SQL特点，简洁迅速，使用LATERAL子查询，允许它们引用前面的FROM项提供的列。

explain(analyze ,buffers )
select t.*, inf.*
from info_type t
         left join LATERAL (select i.*
                            from info i
                            where i.info_type_id = t.id
                            order by i.viewnum desc offset 0
                            limit 3) inf on true;

                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.42..1267.72 rows=300 width=88) (actual time=0.026..1.087 rows=264 loops=1)
   Buffers: shared hit=554
   -  Seq Scan on info_type t  (cost=0.00..2.00 rows=100 width=15) (actual time=0.005..0.011 rows=100 loops=1)
         Buffers: shared hit=1
   -  Limit  (cost=0.42..12.60 rows=3 width=73) (actual time=0.007..0.010 rows=2 loops=100)
         Buffers: shared hit=553
         -  Index Scan Backward using info_typeview on info i  (cost=0.42..406.17 rows=100 width=73) (actual time=0.007..0.010 rows=2 loops=100)
               Index Cond: (info_type_id = t.id)
               Buffers: shared hit=553
 Planning Time: 0.114 ms
 Execution Time: 1.114 ms
(11 行记录)登录后复制

8 优化和结论

整个优化关键点是创建了 info_type_id + viewnum 复合索引，也就是窗口查询 partition by 和 order by 两部分列的复合索引。
array 的应用也是关键的地方，解决了需要返回多行的问题。
多个subplan在嵌套了 array 之后，变成 1 个。
适用场景：适用于排序的队列比较长的情景，比如本例：每个info_type_id 有1万条记录，这种情况下，通过索引只需要访问3行。而如果通过窗口函数，即使使用索引，也必须全部排序后再取前3行。