- Mastering PostgreSQL 12
- Hans Jürgen Sch?nig
- 260字
- 2021-08-20 10:00:34
Investigating performance
Grouping sets is a powerful feature; they help to reduce the number of expensive queries. Internally, PostgreSQL will basically use MixedAggregate to perform the aggregation. It can perform many operations at once, which ensures efficiency, as shown in the following example:
test=# explain SELECT region, country, avg(production)
FROM t_oil
WHERE country IN ('USA', 'Canada', 'Iran', 'Oman')
GROUP BY GROUPING SETS ( (), region, country);
QUERY PLAN
--------------------------------------------------------------------
MixedAggregate (cost=0.00..18.17 rows=17 width=52)
Hash Key: region
Hash Key: country
Group Key: ()
-> Seq Scan on t_oil (cost=0.00..15.66 rows=184 width=24)
Filter: (country = ANY ('{USA,Canada,Iran,Oman}'::text[]))
(6 rows)
In older versions of PostgreSQL, the system used GroupAggregate to perform this operation in all cases. In a more modern version, MixedAggregate has been added. However, you can still force the optimizer to use the old strategy using the enable_hashagg setting. MixedAggregate is essentially HashAggregate and therefore the same setting applies, as shown in the next example:
test=# SET enable_hashagg TO off;
SET
test=# explain SELECT region, country, avg(production)
FROM t_oil
WHERE country IN ('USA', 'Canada', 'Iran', 'Oman')
GROUP BY GROUPING SETS ( (), region, country);
QUERY PLAN
--------------------------------------------------------------------------
GroupAggregate (cost=22.58..32.48 rows=17 width=52)
Group Key: region
Group Key: ()
Sort Key: country
Group Key: country
-> Sort (cost=22.58..23.04 rows=184 width=24)
Sort Key: region
-> Seq Scan on t_oil (cost=0.00..15.66 rows=184 width=24)
Filter: (country = ANY ('{USA,Canada,Iran,Oman}'::text[]))
(9 rows)
test=# SET enable_hashagg TO on;
SET
In general, the hash-based version (MixedAggregate) is faster and is favored by the optimizer if there is enough memory to keep the hash needed for MixedAggregate in memory.