- IBM DB2 9.7 Advanced Administration Cookbook
- Adrian Neagu Robert Pelletier
- 596字
- 2021-08-20 15:33:23
The table distribution key and its role in a multipartitioned environment
The table distribution key is another element that can influence the distribution of table rows across existent partitions defined in a database partition group. Distribution keys can be created explicitly using DISTRIBUTION BY HASH
(columns) directive inside table definition or they can be defined implicitly by DB2 using a series of column rules detailed in this recipe. Usually distribution column might be designed and used to improve query performance.
Getting ready
This recipe is strongly correlated with the preceding one; here, we will also use before and after pictures of how the data is distributed across database partitions.
How to do it...
For some of the queries, we will use Command Editor, because it offers better visibility.
- To find the distribution key columns for the
COMM
table, issue the following command:[db2instp@nodedb21 ~]$ db2 "select name from sysibm.syscolumns where tbname='COMM' and partkeyseq !=0" NAME CKEY 1 record(s) selected. [db2instp@nodedb21 ~]$
Since we did not specify any distribution key at table creation, the primary key is used as the distribution key.
- To find the distribution of rows on partitions, by row numbers, before performing redistribution of table
COMM
, issue the following command (for better visibility run this statement in Command Center | Tools | Command Editor):select dbpartitionnum(CKEY) as PARTITION_NUMBER, count(CKEY) as NUMBER_OF_ROWS from comm group by dbpartitionnum(CKEY) order by dbpartitionnum(CKEY)
- After redistribution, issue the last statement again. Now we can see that rows are distributed slightly uniformly across partitions:

How it works...
Table rows partiticpating in distributed keys are assigned to partitions using a hashing algorithm that calculates their distribution across partitions. If you do not specify a distribution key at table creation, it will be created implicitly, as follows:
- If the table has a primary key, it will be used as a distribution key
- If the table does not have a primary key, the first column, whose data type is not
LOB, LONG VARCHAR, LONG VARGRAPHIC
, orXML
, will be used as a partition key - If the table columns are not in this category, it means that a partition key cannot be defined, and the table can only be created in a table space that has been created on a single-partition database partition group
Distribution keys should be created to improve query performance in the following conditions:
- Columns are frequently used in joins
- Columns are frequently used in group by clause
- Columns defined as primary keys or unique must be included in distribution keys
- Create distribution keys for column used mostly in equality conditions
Practically the key columns are hashed and divided in 4095 or 32768 buckets depending on the distribution map used. Every map entry indicates where a specific row on which partition might be found.
There's more...
You cannot change the distribution key by altering the table, unless the table is spread across only one partition. To change the partition key for tables spanned on multiple partitions you should follow these steps:
- Export or copy table data in a staging table.
- Drop and recreate the table, with the new distribution key defined.
- Import or insert table data.
Distribution keys play a major role also in a method of table distribution called table collocation. Table collocation is another performance tuning technique found in multipartitioned database environments. A table is considered collocated when the following conditions are met:
- Tables are placed in the same table space and are in the same partition groups.
- The distribution keys have the same columns and same data types and are partition compatible.
- Advanced Quantitative Finance with C++
- Learn TypeScript 3 by Building Web Applications
- GitLab Cookbook
- Java高并發核心編程(卷2):多線程、鎖、JMM、JUC、高并發設計模式
- Software Testing using Visual Studio 2012
- Learning ELK Stack
- Mastering Google App Engine
- SAP BusinessObjects Dashboards 4.1 Cookbook
- PhoneGap:Beginner's Guide(Third Edition)
- Learning ArcGIS for Desktop
- 機器學習與R語言實戰
- 一塊面包板玩轉Arduino編程
- 創意UI:Photoshop玩轉APP設計
- Mastering JavaScript
- Access數據庫應用教程(2010版)