PostgreSQL 数据库性能 分区表哈希智能算法
1 背景知识
对于两个大分区表的Hash 连接,性能与hash table的大小直接相关。
1、如果hash table数据量能够被work_mem缓存,那性能是最优的。
2、如果能够利用分区的特点按分区进行hash 链接,那对于性能是非常大的提升。
3、PostgreSQL有两个参数 enable_partitionwise_join
, enable_partitionwise_aggregate
用于分区智能连接和聚合。
4、默认这两个参数是 off 的。
5、开启这两个参数,对于采用hash join 或 hash aggregate 的执行方式有性能提升。
以下我们举例来看该参数对于执行计划的影响。
2 环境准备
DROP TABLE t01;
CREATE TABLE t01(id1 integer,name1 text) PARTITION BY HASH(id1) ;
CREATE TABLE t01_1 PARTITION OF t01 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE t01_2 PARTITION OF t01 FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE t01_3 PARTITION OF t01 FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE t01_4 PARTITION OF t01 FOR VALUES WITH (MODULUS 4, REMAINDER 3);
INSERT INTO t01 SELECT generate_series(1,10000000),'abc'||generate_series(1,10000000);
DROP TABLE t02;
CREATE TABLE t02(id2 integer,name1 text) PARTITION BY HASH(id2) ;
CREATE TABLE t02_1 PARTITION OF t02 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE t02_2 PARTITION OF t02 FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE t02_3 PARTITION OF t02 FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE t02_4 PARTITION OF t02 FOR VALUES WITH (MODULUS 4, REMAINDER 3);
INSERT INTO t02 SELECT generate_series(1,3000000),'abc'||generate_series(1,3000000);
3 分区智能连接
3.1 关闭分区智能连接
SHOW enable_partitionwise_join ;
//屏幕输出:
enable_partitionwise_join
---------------------------
off
(1 row)
3.2 查看执行计划
1、当前 work_mem 为4M
时。无法缓存所有的 HASH
数据。
2、 Batches: 32
是指 HASH
缓存了32次。都被缓存在临时表中。效率并不高。
EXPLAIN ANALYZE SELECT id1 , count(*) FROM t01 GROUP BY id1;
//屏幕输出:
QUERY PLAN
-------------------------------------------
Finalize Aggregate (cost=1067929074.38..1067929074.39 rows=1 width=8) (actual time=2135.213..2248.708 rows=1 loops=1)
-> Gather (cost=1067929074.16..1067929074.37 rows=2 width=8) (actual time=2133.141..2248.691 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1067928074.16..1067928074.17 rows=1 width=8) (actual time=2113.254..2113.444 rows=1 loops=3)
-> Parallel Hash Join (cost=55477.00..911678074.16 rows=62500000000 width=0) (actual time=1583.112..2074.008 rows=1000000 loops=3)
Hash Cond: (t01.id1 = t02.id2)
-> Parallel Append (cost=0.00..116556.00 rows=4166666 width=4) (actual time=0.087..620.678 rows=3333333 loops=3)
-> Parallel Seq Scan on t01_4 (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.076..307.641 rows=2501606 loops=1)
-> Parallel Seq Scan on t01_2 (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.070..302.509 rows=2500388 loops=1)
-> Parallel Seq Scan on t01_1 (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.058..101.870 rows=833080 loops=3)
-> Parallel Seq Scan on t01_3 (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.112..309.827 rows=2498765 loops=1)
-> Parallel Hash (cost=34969.00..34969.00 rows=1250000 width=4) (actual time=496.988..496.990 rows=1000000 loops=3)
Buckets: 262144 Batches: 32 Memory Usage: 5760kB
-> Parallel Append (cost=0.00..34969.00 rows=1250000 width=4) (actual time=178.960..354.000 rows=1000000 loops=3)
-> Parallel Seq Scan on t02_4 (cost=0.00..7190.62 rows=312962 width=4) (actual time=197.575..285.326 rows=751110 loops=
1)
-> Parallel Seq Scan on t02_2 (cost=0.00..7188.90 rows=312890 width=4) (actual time=197.273..278.184 rows=750935 loops=
1)
-> Parallel Seq Scan on t02_1 (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.061..30.200 rows=249665 loops=3)
-> Parallel Seq Scan on t02_3 (cost=0.00..7169.66 rows=312066 width=4) (actual time=142.029..217.814 rows=748959 loops=
1)
Planning Time: 0.206 ms
JIT:
Functions: 71
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 3.574 ms, Inlining 152.357 ms, Optimization 234.494 ms, Emission 150.277 ms, Total 540.702 ms
Execution Time: 2249.942 ms
(25 rows)
3.3 开启分区智能连接
SET enable_partitionwise_join=on;
3.4 查看执行计划
因为hash 分区,分区列的数据类型与分区数量相同,能够保证相同的数据hash到对应的分区。
1、由于单个分区的数据量更少,更有可能使用内存排序。
2、batches : 8
,缓存次数变少了。
EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
//屏幕输出:
QUERY PLAN
-----------
Finalize Aggregate (cost=213601.21..213601.22 rows=1 width=8) (actual time=1844.304..1943.383 rows=1 loops=1)
-> Gather (cost=213600.99..213601.20 rows=2 width=8) (actual time=1841.401..1943.349 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=212600.99..212601.00 rows=1 width=8) (actual time=1820.105..1820.114 rows=1 loops=3)
-> Parallel Append (cost=12323.02..209475.99 rows=1250000 width=0) (actual time=867.670..1776.608 rows=1000000 loops=3)
-> Parallel Hash Join (cost=12325.65..50851.78 rows=312962 width=0) (actual time=862.010..1267.260 rows=751110 loops=1)
Hash Cond: (t01_4.id1 = t02_4.id2)
-> Parallel Seq Scan on t01_4 (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.080..284.806 rows=2501606 loops=1)
-> Parallel Hash (cost=7190.62..7190.62 rows=312962 width=4) (actual time=244.417..244.418 rows=751110 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5760kB
-> Parallel Seq Scan on t02_4 (cost=0.00..7190.62 rows=312962 width=4) (actual time=18.441..125.767 rows=751110 loops=1
)
-> Parallel Hash Join (cost=12323.02..50830.87 rows=312890 width=0) (actual time=875.893..1289.311 rows=750935 loops=1)
Hash Cond: (t01_2.id1 = t02_2.id2)
-> Parallel Seq Scan on t01_2 (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.071..291.676 rows=2500388 loops=1)
-> Parallel Hash (cost=7188.90..7188.90 rows=312890 width=4) (actual time=250.357..250.358 rows=750935 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5760kB
-> Parallel Seq Scan on t02_2 (cost=0.00..7188.90 rows=312890 width=4) (actual time=17.412..131.929 rows=750935 loops=1
)
-> Parallel Hash Join (cost=12290.84..50775.75 rows=312082 width=0) (actual time=292.223..440.440 rows=249665 loops=3)
Hash Cond: (t01_1.id1 = t02_1.id2)
-> Parallel Seq Scan on t01_1 (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.057..98.739 rows=833080 loops=3)
-> Parallel Hash (cost=7169.82..7169.82 rows=312082 width=4) (actual time=72.022..72.022 rows=249665 loops=3)
Buckets: 262144 Batches: 8 Memory Usage: 5728kB
-> Parallel Seq Scan on t02_1 (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.099..34.730 rows=249665 loops=3)
-> Parallel Hash Join (cost=12290.49..50767.60 rows=312066 width=0) (actual time=865.103..1257.990 rows=748959 loops=1)
Hash Cond: (t01_3.id1 = t02_3.id2)
-> Parallel Seq Scan on t01_3 (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.078..286.559 rows=2498765 loops=1)
-> Parallel Hash (cost=7169.66..7169.66 rows=312066 width=4) (actual time=247.173..247.174 rows=748959 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5728kB
-> Parallel Seq Scan on t02_3 (cost=0.00..7169.66 rows=312066 width=4) (actual time=19.626..125.604 rows=748959 loops=1
)
Planning Time: 1.897 ms
JIT:
Functions: 116
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 5.283 ms, Inlining 0.000 ms, Optimization 2.231 ms, Emission 53.717 ms, Total 61.231 ms
Execution Time: 1969.033 ms
(36 rows)
4 分区聚合
SET enable_partitionwise_aggregate=on;
EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
//屏幕输出:
postgres=# EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
--------
Finalize Aggregate (cost=207351.87..207351.88 rows=1 width=8) (actual time=1729.958..1832.467 rows=1 loops=1)
-> Gather (cost=52613.09..207351.85 rows=8 width=8) (actual time=1273.784..1832.442 rows=6 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=51613.09..206351.05 rows=4 width=8) (actual time=1281.861..1705.475 rows=2 loops=3)
-> Partial Aggregate (cost=51634.18..51634.19 rows=1 width=8) (actual time=1286.699..1286.702 rows=1 loops=1)
-> Parallel Hash Join (cost=12325.65..50851.78 rows=312962 width=0) (actual time=863.081..1252.165 rows=751110 loops=1)
Hash Cond: (t01_3.id1 = t02_3.id2)
-> Parallel Seq Scan on t01_4 t01_3 (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.072..280.515 rows=2501606 loops
=1)
-> Parallel Hash (cost=7190.62..7190.62 rows=312962 width=4) (actual time=255.924..255.924 rows=751110 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5760kB
-> Parallel Seq Scan on t02_4 t02_3 (cost=0.00..7190.62 rows=312962 width=4) (actual time=20.122..137.145 rows=751110 l
oops=1)
-> Partial Aggregate (cost=51613.09..51613.10 rows=1 width=8) (actual time=1285.761..1285.764 rows=1 loops=1)
-> Parallel Hash Join (cost=12323.02..50830.87 rows=312890 width=0) (actual time=868.068..1253.114 rows=750935 loops=1)
Hash Cond: (t01_1.id1 = t02_1.id2)
-> Parallel Seq Scan on t01_2 t01_1 (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.060..280.214 rows=2500388 loops
=1)
-> Parallel Hash (cost=7188.90..7188.90 rows=312890 width=4) (actual time=260.389..260.389 rows=750935 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5760kB
-> Parallel Seq Scan on t02_2 t02_1 (cost=0.00..7188.90 rows=312890 width=4) (actual time=20.181..139.650 rows=750935 l
oops=1)
-> Partial Aggregate (cost=51555.95..51555.96 rows=1 width=8) (actual time=423.604..423.606 rows=1 loops=3)
-> Parallel Hash Join (cost=12290.84..50775.75 rows=312082 width=0) (actual time=287.110..412.499 rows=249665 loops=3)
Hash Cond: (t01.id1 = t02.id2)
-> Parallel Seq Scan on t01_1 t01 (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.057..96.679 rows=833080 loops=3)
-> Parallel Hash (cost=7169.82..7169.82 rows=312082 width=4) (actual time=71.926..71.927 rows=249665 loops=3)
Buckets: 262144 Batches: 8 Memory Usage: 5760kB
-> Parallel Seq Scan on t02_1 t02 (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.069..35.350 rows=249665 loops
=3)
-> Partial Aggregate (cost=51547.76..51547.77 rows=1 width=8) (actual time=1273.119..1273.121 rows=1 loops=1)
-> Parallel Hash Join (cost=12290.49..50767.60 rows=312066 width=0) (actual time=861.259..1239.197 rows=748959 loops=1)
Hash Cond: (t01_2.id1 = t02_2.id2)
-> Parallel Seq Scan on t01_3 t01_2 (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.054..283.870 rows=2498765 loops
=1)
-> Parallel Hash (cost=7169.66..7169.66 rows=312066 width=4) (actual time=250.134..250.135 rows=748959 loops=1)
Buckets: 262144 Batches: 8 Memory Usage: 5728kB
-> Parallel Seq Scan on t02_3 t02_2 (cost=0.00..7169.66 rows=312066 width=4) (actual time=20.139..128.932 rows=748959 l
oops=1)
Planning Time: 0.558 ms
JIT:
Functions: 134
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 6.791 ms, Inlining 0.000 ms, Optimization 2.127 ms, Emission 58.685 ms, Total 67.603 ms
Execution Time: 1835.140 ms
(39 rows)
4.1 关闭智能聚合
SHOW enable_partitionwise_aggregate;
//屏幕输出:
enable_partitionwise_aggregate
--------------------------------
off
(1 row)
4.2 开启只能聚合
SET enable_partitionwise_aggregate=on;