- Mastering PostgreSQL 12
- Hans Jürgen Sch?nig
- 542字
- 2021-08-20 10:00:35
Understanding the subtle difference between ROWS and RANGE
So far, you have seen sliding windows using OVER ... ROWS. However, there is more. Let's take a look at the SQL specification taken directly from the PostgreSQL documentation:
{ RANGE | ROWS | GROUPS } frame_start [ frame_exclusion ]
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end [ frame_exclusion ]
There is more than just ROWS. In real life, we have seen that many people are struggling to understand the difference between RANGE and ROWS. In many cases, the result is the same, which adds even more to the confusion. To understand the problem, let's first create some simple data:
test=# SELECT *, x / 3 AS y FROM generate_series(1, 15) AS x;
x | y
----+---
1 | 0
2 | 0
3 | 1
4 | 1
5 | 1
6 | 2
7 | 2
8 | 2
9 | 3
10 | 3
11 | 3
12 | 4
13 | 4
14 | 4
15 | 5
(15 rows)
This is a simple dataset. Be particularly aware of the second column, which contains a couple of duplicates. Those will be relevant in a minute:
test=# SELECT *, x / 3 AS y,
array_agg(x) OVER (ORDER BY x
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS rows_1,
array_agg(x) OVER (ORDER BY x
RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS range_1,
array_agg(x/3) OVER (ORDER BY (x/3)
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS rows_2,
array_agg(x/3) OVER (ORDER BY (x/3)
RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS range_2
FROM generate_series(1, 15) AS x;
x | y | rows_1 | range_1 | rows_2 | range_2
----+---+------------+------------+---------+---------------------
1 | 0 | {1,2} | {1,2} | {0,0} | {0,0,1,1,1}
2 | 0 | {1,2,3} | {1,2,3} | {0,0,1} | {0,0,1,1,1}
3 | 1 | {2,3,4} | {2,3,4} | {0,1,1} | {0,0,1,1,1,2,2,2}
4 | 1 | {3,4,5} | {3,4,5} | {1,1,1} | {0,0,1,1,1,2,2,2}
5 | 1 | {4,5,6} | {4,5,6} | {1,1,2} | {0,0,1,1,1,2,2,2}
6 | 2 | {5,6,7} | {5,6,7} | {1,2,2} | {1,1,1,2,2,2,3,3,3}
7 | 2 | {6,7,8} | {6,7,8} | {2,2,2} | {1,1,1,2,2,2,3,3,3}
8 | 2 | {7,8,9} | {7,8,9} | {2,2,3} | {1,1,1,2,2,2,3,3,3}
9 | 3 | {8,9,10} | {8,9,10} | {2,3,3} | {2,2,2,3,3,3,4,4,4}
10 | 3 | {9,10,11} | {9,10,11} | {3,3,3} | {2,2,2,3,3,3,4,4,4}
11 | 3 | {10,11,12} | {10,11,12} | {3,3,4} | {2,2,2,3,3,3,4,4,4}
12 | 4 | {11,12,13} | {11,12,13} | {3,4,4} | {3,3,3,4,4,4,5}
13 | 4 | {12,13,14} | {12,13,14} | {4,4,4} | {3,3,3,4,4,4,5}
14 | 4 | {13,14,15} | {13,14,15} | {4,4,5} | {3,3,3,4,4,4,5}
15 | 5 | {14,15} | {14,15} | {4,5} | {4,4,4,5}
(15 rows)
After listing the x and y columns, I have applied windowing functions on x. As you can see, the results are the same for both columns. rows_1 and range_1 are absolutely identical. The situation changes if we start to use the column containing those duplicates. In the case of ROWS, PostgreSQL simply takes the previous and the next rows. In the case of RANGE, it takes the entire group of duplicates. Hence, the array is a lot longer. The entire group of identical values is taken.
- JavaScript實例自學(xué)手冊
- Getting Started with Clickteam Fusion
- Hands-On Machine Learning on Google Cloud Platform
- 數(shù)據(jù)中心建設(shè)與管理指南
- PostgreSQL Administration Essentials
- Spark大數(shù)據(jù)技術(shù)與應(yīng)用
- Ceph:Designing and Implementing Scalable Storage Systems
- Red Hat Linux 9實務(wù)自學(xué)手冊
- Applied Data Visualization with R and ggplot2
- Linux Shell Scripting Cookbook(Third Edition)
- 案例解說Delphi典型控制應(yīng)用
- QTP自動化測試實踐
- 歐姆龍PLC應(yīng)用系統(tǒng)設(shè)計實例精解
- Apache Hadoop 3 Quick Start Guide
- 深度剖析:硬盤固件級數(shù)據(jù)恢復(fù)