- 數據庫高效優化:架構、規范與SQL技巧
- 馬立和 高振嬌 韓鋒
- 2126字
- 2020-08-03 16:49:23
7.1 表
“表”是大家最為熟知的一個對象。它也是保存數據的實體。Oracle數據庫支持多種表的類型。大家最為常見的就是堆表,它也是適用范圍最廣的一種表。此外,還支持索引組織表、簇表等。除了按照表的結構分類外,還可以根據表的組織形式、用途等進行分類,比如常見的分區表、臨時表等。
下面我們將從最常見的堆表開始介紹。
1.堆表
堆表是Oracle默認的表類型,也是最常用的表類型。除非有特殊原因要使用其他表類型,否則都使用堆表類型。對于堆表來說,最常見的影響性能的因素就是表的規模。這一點很容易理解,規模越大,掃描的塊數越多,當然成本也就越高。前面兩章提到,如果對表進行全表掃描,會掃描高水位線以下的所有塊。這也解釋了為什么刪除數據后,掃描表仍然很慢。下面通過一個示例說明。
SQL> create table t1 as select * from dba_objects; //表已創建 SQL> insert into t1 select * from t1; //已創建 18865 行 SQL> insert into t1 select * from t1; //已創建 37730 行 SQL> insert into t1 select * from t1; //已創建 75460 行 SQL> insert into t1 select * from t1; //已創建 150920 行 SQL> commit; //提交完成 //這里我們構造一張大表,并插入了幾十萬條記錄 SQL> set serveroutput on SQL> exec show_space('t1','auto'); Total Blocks............................4096 Total Bytes.............................33554432 Unused Blocks...........................0 Unused Bytes............................0 Last Used Ext FileId....................4 Last Used Ext BlockId...................6144 Last Used Block.........................128 PL/SQL 過程已成功完成。 /* 這里我們調用了一個存儲過程(附錄中會詳細說明)。通過這個存儲過程,我們可以觀察到表的高水位線信息。對于上面這個示例,高水位線的位置在Total Blocks – Unused Blocks = 4096 */ SQL> set autotracetraceonly SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1115 (2)| 00:00:14 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 4559K| 1115 (2)| 00:00:14 | ------------------------------------------------------------------- 統計信息 ---------------------------------------------------------- 5903 consistent gets 4059 physical reads //通過上面執行的SQL語句可見,這個查詢語句大約要執行5000多次邏輯讀操作 SQL> delete from t1; //已刪除301840行 SQL> commit; //提交完成 SQL> exec show_space('t1','auto'); Total Blocks............................4096 Total Bytes.............................33554432 Unused Blocks...........................0 Unused Bytes............................0 Last Used Ext FileId....................4 Last Used Ext BlockId...................6144 Last Used Block.........................128 //PL/SQL 過程已成功完成 //刪除操作后,我們通過觀察發現高水位線沒有變化 SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1096 (1)| 00:00:14 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 1 | 1096 (1)| 00:00:14 | ------------------------------------------------------------------- 統計信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 5166 consistent gets 4026 physical reads //刪除之后執行查詢,仍然需要5000多次的邏輯讀操作 SQL> truncate table t1; //表被截斷 SQL> set autotrace off SQL> exec show_space('t1','auto'); Total Blocks............................8 Total Bytes.............................65536 Unused Blocks...........................5 Unused Bytes............................40960 Last Used Ext FileId....................4 Last Used Ext BlockId...................2096 Last Used Block.........................3 //PL/SQL 過程已成功完成 //截斷表后,高水位線明顯降低了 SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 1 | 2 (0)| 00:00:01 | ------------------------------------------------------------------- 統計信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets 1 physical reads /* 高水位線降低后,再次執行查詢語句,可見其邏輯讀非常小。這也說明了降低高水位線對全表掃描的影響 */
2.索引組織表
索引組織表,顧名思義,就是存儲在一個索引結構中的表,也就是以B+樹結構存儲。換句話說,在索引組織表中,索引就是數據,數據就是索引,兩者合二為一。索引組織表的好處并不在于解決磁盤空間的占用,而是可以減少I/O,進而減少訪問緩沖區緩存。
下面我們通過一個示例說明普通堆表與索引組織表的訪問對比。
SQL> create table t_normal( aint,bint,c varchar2(100)); //表已創建 SQL> create index idx_normal_a on t_normal(a); //索引已創建 SQL> insert into t_normal select rownum,object_id,object_name from dba_objects; //已創建 18867 行 SQL> commit; //提交完成 SQL> create table t_iot(a int,bint,c varchar2(100),primary key(a)) organization index; //表已創建 SQL> insert into t_iot select rownum,object_id,object_name from dba_objects; //已創建 18869 行 SQL> commit; //提交完成 //上面分別創建了普通表和索引組織表,并插入了相同數據 SQL> set autotracetraceonly SQL> select * from t_normal where a=1000; -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 78 | 1 (0)| 00:00 | 1 | TABLE ACCESS BY INDEX ROWID | T_NORMAL | 1 | 78 | 1 (0)| 00:00 |* 2 | INDEX RANGE SCAN | IDX_NORMAL_A | 74 | | 1 (0)| 00:00 -------------------------------------------------------------------------------- 統計信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads //從上面輸出可見,在堆表中訪問這條記錄需要4個邏輯讀操作。從執行計劃可見,需要一個回表查詢 SQL> select * from t_iot where a=1000; -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 78 | 1 (0)| 00:00:01 | |* 1 | INDEX UNIQUE SCAN | SYS_IOT_TOP_21676 | 1 | 78 | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------- 統計信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 2 consistent gets 0 physical reads /* 從上面輸出可見,在索引組織表中訪問這條記錄需要2個邏輯讀,相較堆表訪問大大減少了。從執行計劃可見,可以直接訪問,不需要回表;或者說,就是通過索引直接訪問數據 */
3.分區表
分區表是Oracle數據庫中應對大規模數據量的一種很好的解決方案。其基本原理很簡單,就是將大的對象分解為若干個小對象。當訪問表時,根據分區策略,可以精確地定位到小對象(單個分區),進而提高訪問效率。需要說明的是,分區表中的每個分區也是一個獨立的堆表,只是邏輯上看起來是一張完整的表。
下面通過一個示例說明普通表與分區表的訪問區別。
[hf@testdb] SQL> create table t_part 2 ( 3 owner varchar2(30), 4 object_namevarchar2(128), 5 object_id number, 6 created date 7 ) 8 partition by range (created) 9 ( 10 partition part_201305 values less than(to_date('2013-06-01','yyyy-mm-dd')), 11 partition part_201306 values less than(to_date('2013-07-01','yyyy-mm-dd')), 12 partition part_201307 values less than(to_date('2013-08-01','yyyy-mm-dd')), 13 partition part_201308 values less than(to_date('2013-09-01','yyyy-mm-dd')), 14 partition part_201309 values less than(to_date('2013-10-01','yyyy-mm-dd')), 15 partition part_201310 values less than(to_date('2013-11-01','yyyy-mm-dd')), 16 partition part_201311 values less than(to_date('2013-12-01','yyyy-mm-dd')), 17 partition part_201312 values less than(to_date('2014-01-01','yyyy-mm-dd')), 18 partition part_201401 values less than(to_date('2014-02-01','yyyy-mm-dd')), 19 partition part_201402 values less than(to_date('2014-03-01','yyyy-mm-dd')), 20 partition part_201403 values less than(to_date('2014-04-01','yyyy-mm-dd')), 21 partition part_201404 values less than(to_date('2014-05-01','yyyy-mm-dd')), 22 partition part_201405 values less than(to_date('2014-06-01','yyyy-mm-dd')), 23 partitionpart_max values less than(maxvalue) 24 ); Table created. [hf@testdb] SQL> create table t_normal 2 ( 3 owner varchar2(30), 4 object_namevarchar2(128), 5 object_id number, 6 created date 7 ); Table created. [hf@testdb] SQL> insert into t_normal select owner,object_name,object_id,created from sys.dba_objects; 86299 rows created. [hf@testdb] SQL> commit; Commit complete. [hf@testdb] SQL> insert into t_part select owner,object_name,object_id,created from sys.dba_objects; 86299 rows created. [hf@testdb] SQL> commit; Commit complete. //上面創建了兩張表,并插入了數萬條記錄 [hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_normal'); PL/SQL procedure successfully completed. [hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_part'); PL/SQL procedure successfully completed. [hf@testdb] SQL> select * from t_part where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd'); -------------------------------------------------------------------------------- | Id | Operation | Name |Rows|Bytes| Cost(%CPU)| Time |Pstart|Pstop -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 2 (0)| 00:00:01| | | 1 | PARTITION RANGE SINGLE | | 1 | 105 | 2 (0)| 00:00:01| 7| 7 |* 2 | TABLE ACCESS FULL |T_PART| 1 | 105 | 2 (0)| 00:00:01| 7| 7 -------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 0 consistent gets 0 physical reads //從分區表的訪問可見,沒有消耗邏輯讀。原因是該分區內沒有數據,所以無須讀取 [hf@testdb] SQL> select * from t_normal where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd'); ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 75 | 3300 | 171 (1)| 00:00:03 | |* 1 | TABLE ACCESS FULL | T_NORMAL | 75 | 3300 | 171 (1)| 00:00:03 | ------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 12 recursive calls 0 db block gets 643 consistent gets 0 physical reads /* 如果直接對表進行訪問,則需要600多次邏輯讀。由此可見,通過分區訪問可以更精確地定位數據,減少訪問規模 */
推薦閱讀
- Python數據分析、挖掘與可視化從入門到精通
- Enterprise Integration with WSO2 ESB
- 大數據Hadoop 3.X分布式處理實戰
- 一個64位操作系統的設計與實現
- 數據庫技術及應用教程
- 深入淺出Greenplum分布式數據庫:原理、架構和代碼分析
- Hadoop大數據開發案例教程與項目實戰(在線實驗+在線自測)
- Hadoop集群與安全
- Python 3爬蟲、數據清洗與可視化實戰
- 企業級大數據項目實戰:用戶搜索行為分析系統從0到1
- Oracle 內核技術揭密
- 數據分析方法及應用:基于SPSS和EXCEL環境
- 一本書講透數據治理:戰略、方法、工具與實踐
- 工業大數據分析實踐
- MySQL應用實戰與性能調優