書名： Mastering PostgreSQL 12
作者名： Hans Jürgen Sch?nig
本章字?jǐn)?shù)： 283字
更新時間： 2021-08-20 10:00:26

Clustering tables

In PostgreSQL, there is a command called CLUSTER that allows us to rewrite a table in the desired order. It is possible to point to an index and store data in the same order as the index:

test=# \h CLUSTER
Command: CLUSTER
Description: cluster a table according to an index
Syntax:
CLUSTER [VERBOSE] table_name [ USING index_name ]
CLUSTER [VERBOSE]

URL: https://www.postgresql.org/docs/12/sql-cluster.html

The CLUSTER command has been around for many years and serves its purpose well. However, there are some things to consider before blindly running it on a production system:

The CLUSTER command will lock the table while it is running. You cannot insert or modify data while CLUSTER is running. This might not be acceptable on a production system.
Data can only be organized according to one index. You cannot order a table by zip code, name, ID, birthday, and so on, at the same time. This means that CLUSTER will make sense if there are search criteria that are used most of the time.
Keep in mind that the example outlined in this book is more of a worst-case scenario. In reality, the performance difference between a clustered and a non-clustered table will depend on the workload, the amount of data retrieved, cache hit rates, and a lot more.
The clustered state of a table will not be maintained as changes are made to a table during normal operations. Correlation will usually deteriorate as time goes by.

Here is an example of how to run the CLUSTER command:

test=# CLUSTER t_random USING idx_random; 
CLUSTER

Depending on the size of the table, the time needed to cluster will vary.

官术网_书友最值得收藏!

Mastering PostgreSQL 12

Clustering tables