官术网_书友最值得收藏!

  • Mastering PostgreSQL 12
  • Hans Jürgen Sch?nig
  • 542字
  • 2021-08-20 10:00:35

Understanding the subtle difference between ROWS and RANGE

So far, you have seen sliding windows using OVER ... ROWS. However, there is more. Let's take a look at the SQL specification taken directly from the PostgreSQL documentation:

{ RANGE | ROWS | GROUPS } frame_start [ frame_exclusion ]
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end [ frame_exclusion ]

There is more than just ROWS. In real life, we have seen that many people are struggling to understand the difference between RANGE and ROWS. In many cases, the result is the same, which adds even more to the confusion. To understand the problem, let's first create some simple data:

test=# SELECT *, x / 3 AS y FROM generate_series(1, 15) AS x;
x | y
----+---
1 | 0
2 | 0
3 | 1
4 | 1
5 | 1
6 | 2
7 | 2
8 | 2
9 | 3
10 | 3
11 | 3
12 | 4
13 | 4
14 | 4
15 | 5
(15 rows)

This is a simple dataset. Be particularly aware of the second column, which contains a couple of duplicates. Those will be relevant in a minute:

test=# SELECT *, x / 3 AS y, 
array_agg(x) OVER (ORDER BY x
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS rows_1,
array_agg(x) OVER (ORDER BY x
RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS range_1,
array_agg(x/3) OVER (ORDER BY (x/3)
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS rows_2,
array_agg(x/3) OVER (ORDER BY (x/3)
RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS range_2
FROM generate_series(1, 15) AS x;
x | y | rows_1 | range_1 | rows_2 | range_2
----+---+------------+------------+---------+---------------------
1 | 0 | {1,2} | {1,2} | {0,0} | {0,0,1,1,1}
2 | 0 | {1,2,3} | {1,2,3} | {0,0,1} | {0,0,1,1,1}
3 | 1 | {2,3,4} | {2,3,4} | {0,1,1} | {0,0,1,1,1,2,2,2}
4 | 1 | {3,4,5} | {3,4,5} | {1,1,1} | {0,0,1,1,1,2,2,2}
5 | 1 | {4,5,6} | {4,5,6} | {1,1,2} | {0,0,1,1,1,2,2,2}
6 | 2 | {5,6,7} | {5,6,7} | {1,2,2} | {1,1,1,2,2,2,3,3,3}
7 | 2 | {6,7,8} | {6,7,8} | {2,2,2} | {1,1,1,2,2,2,3,3,3}
8 | 2 | {7,8,9} | {7,8,9} | {2,2,3} | {1,1,1,2,2,2,3,3,3}
9 | 3 | {8,9,10} | {8,9,10} | {2,3,3} | {2,2,2,3,3,3,4,4,4}
10 | 3 | {9,10,11} | {9,10,11} | {3,3,3} | {2,2,2,3,3,3,4,4,4}
11 | 3 | {10,11,12} | {10,11,12} | {3,3,4} | {2,2,2,3,3,3,4,4,4}
12 | 4 | {11,12,13} | {11,12,13} | {3,4,4} | {3,3,3,4,4,4,5}
13 | 4 | {12,13,14} | {12,13,14} | {4,4,4} | {3,3,3,4,4,4,5}
14 | 4 | {13,14,15} | {13,14,15} | {4,4,5} | {3,3,3,4,4,4,5}
15 | 5 | {14,15} | {14,15} | {4,5} | {4,4,4,5}
(15 rows)

After listing the x and y columns, I have applied windowing functions on x. As you can see, the results are the same for both columns. rows_1 and range_1 are absolutely identical. The situation changes if we start to use the column containing those duplicates. In the case of ROWS, PostgreSQL simply takes the previous and the next rows. In the case of RANGE, it takes the entire group of duplicates. Hence, the array is a lot longer. The entire group of identical values is taken.

主站蜘蛛池模板: 巴林左旗| 开封县| 儋州市| 关岭| 衡南县| 大同县| 波密县| 黄石市| 神池县| 吉隆县| 海宁市| 辉县市| 大港区| 昭觉县| 崇文区| 保靖县| 杂多县| 定兴县| 柘城县| 神木县| 洞头县| 和平县| 彭水| 金华市| 桑植县| 甘德县| 安仁县| 洱源县| 平顺县| 柯坪县| 长春市| 慈溪市| 大同县| 邻水| 同心县| 崇信县| 沙雅县| 镇康县| 广汉市| 凤山县| 眉山市|