- Oracle DBA手記 3:數據庫性能優化與內部原理解析
- 蓋國強 楊廷琨主編
- 2404字
- 2019-01-01 13:23:07
TRUNCATE語句時間過長的診斷
有人提出問題:執行TRUNCATE語句為什么比執行DELETE語句還慢。默認情況下TRUNCATE要回收空間,而且會將所有的臟塊寫回到數據文件,因此這個DDL語句會涉及很多后臺的遞歸調用。
基于上述原因,對于小數據量的表而言,TRUNCATE操作比DELETE慢也是正常的。不過在提問網友給出的例子中,TRUNCATE語句慢得太多。下面通過測試例子來看問題出在哪里。
測試案例
問題發生在網友的環境中,下面的這個例子是在找到問題原因之后,通過重新模擬問題得到的。為了更直觀地說明這個問題,把這個例子放在最開始的部分:表中只有5條記錄時TRUNCATE操作要比DELETE慢得多:
SQL> SET TIMING ON SQL> DELETE T_BIG; 4 rows deleted. Elapsed: 00:00:00.01 SQL> TRUNCATE TABLE T_BIG; Table truncated. Elapsed: 00:00:03.25
如果表中數據量不大,TRUNCATE比DELETE慢一點是正常的,但是二者的執行時間一般是同一個數量級的:
SQL> CREATE TABLE T_TRUNCATE (ID NUMBER); Table created. Elapsed: 00:00:00.06 SQL> INSERT INTO T_TRUNCATE VALUES (1); 1 row created. Elapsed: 00:00:00.00 SQL> DELETE T_TRUNCATE; 1 row deleted. Elapsed: 00:00:00.03 SQL> TRUNCATE TABLE T_TRUNCATE; Table truncated. Elapsed: 00:00:00.10
上面是一個正常的例子,而文章開始模擬的例子中,TRUNCATE語句和DELETE語句相差了幾個數量級,這肯定不是“數據量較小”就可以解釋得通的。
提問者除了給出TRUNCATE語句和DELETE語句的時間對比外,還對TRUNCATE語句設置了10046事件進行跟蹤(TRACE):
SQL> ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12'; Session altered. Elapsed: 00:00:00.04 SQL> TRUNCATE TABLE T_BIG; Table truncated. Elapsed: 00:00:01.52 SQL> ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF'; Session altered. Elapsed: 00:00:00.00
第二次運行TRUNCATE,雖然時間比第一次運行要短1/3,但是仍然要花費將近2秒的時間。
提問者使用TKPROF對得到的trace文件進行了格式化,并將格式化后的文件發到了論壇上,其中格式化后的結果類似于如下:
[oracle@yans1 udump]$ more trace_result.txt TKPROF: Release 10.2.0.3.0- Production on星期四 7月 1 15:07:052010 Copyright (c) 1982, 2005, Oracle. All rights reserved. TRUNCATE TABLE T_BIG call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.36 1.47 4453 365 9211 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.37 1.48 4453 365 9211 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 74 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 4452 0.00 0.06 local write wait 2229 0.00 0.67 ******************************************************************************** . . . OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 2 0.00 0.00 0 0 0 0 Execute 2 0.36 1.47 4453 365 9211 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.37 1.48 4453 365 9211 0 Misses in library cache during parse: 1 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited . . . ---------------------------------------- Waited ---------- ------------ db file sequential read 4452 0.00 0.06 local write wait 2229 0.00 0.67 OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 63 0.01 0.10 0 0 0 0 Execute 80 0.04 0.14 0 110 39 28 Fetch 36 0.00 0.02 1 246 0 29 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 179 0.06 0.27 1 356 39 57 Misses in library cache during parse: 20 Misses in library cache during execute: 20 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 1 0.01 0.01 10 user SQL statements in session. 70 internal SQL statements in session. 80 SQL statements in session. ********************************************************************************
根據格式化后的trace信息可以做出下面的判斷:
Oracle的遞歸SQL的運行時間只占了很小的比重,絕大部分時間都是消耗在TRUNCATE語句上。
TRUNCATE語句居然產生了4000多次的物理讀和9000多次的邏輯讀,而表本身并沒有那么多的BLOCK存在。何況即使表本身比較大,在第一次執行了TRUNCATE操作以后,高水位線已經重置,再次執行TRUNCATE時,表本身應該已經很小才對。
TRUNCATE語句的主要等待事件是db file sequential read和local write wait,這兩個等待時間一般和索引有關,而與表的關系并不大。
僅根據現有的信息無法判斷具體的原因,因此要求提問者提供進一步的信息:包含表結構的DBMS_METADATA輸出結果以及TKPROF格式化之前的原始trace文件。
其中DBMS_METADATA.GET_DDL的輸出類似如下:
SQL> SET LONG 10000 SQL> SELECT DBMS_METADATA.GET_DDL('TABLE', 'T_BIG') FROM DUAL; DBMS_METADATA.GET_DDL('TABLE','T_BIG') -------------------------------------------------------------------------------- CREATE TABLE "TEST"."T_BIG" ( "OWNER" VARCHAR2(30), "OBJECT_NAME" VARCHAR2(128), "SUBOBJECT_NAME" VARCHAR2(30), "OBJECT_ID" NUMBER, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(19), "CREATED" DATE, "LAST_DDL_TIME" DATE, "TIMESTAMP" VARCHAR2(19), "STATUS" VARCHAR2(7), "TEMPORARY" VARCHAR2(1), "GENERATED" VARCHAR2(1), "SECONDARY" VARCHAR2(1), PRIMARY KEY ("OBJECT_ID") USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE(INITIAL 1073741824 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "GPO" ENABLE ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 973078528 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "GPO"
其中表和主鍵索引的初始EXTENT的大小比較奇怪,既然表中記錄很少,為什么表和主鍵的初始化INITIAL EXTENT會那么大。
進一步分析詳細的trace信息如下:
[oracle@yans1 udump]$ more test08_ora_20627.trc /opt/ora10g/admin/test08/udump/test08_ora_20627.trc . . . ORACLE_HOME = /opt/ora10g/product/10.2.0/db_1 System name: Linux Node name: yans1 Release: 2.6.9-42.0.0.0.1.ELsmp Version: #1 SMP Sun Oct 15 15:13:57 PDT 2006 Machine: x86_64 Instance name: test08 Redo thread mounted by this instance: 1 Oracle process number: 16 Unix process pid: 20627, image: oracle@yans1 (TNS V1-V3) *** 2010-07-01 15:03:54.687 *** ACTION NAME:() 2010-07-01 15:03:54.673 *** MODULE NAME:(SQL*Plus) 2010-07-01 15:03:54.673 *** SERVICE NAME:(SYS$USERS) 2010-07-01 15:03:54.673 *** SESSION ID:(317.11354) 2010-07-01 15:03:54.673 WAIT #3: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=147960 tim=1248015463547954 WAIT #3: nam='SQL*Net message from client' ela= 5759303 driver id=1650815232 #bytes=1 p3=0 obj#=147960 tim=1248015469321192 XCTEND rlbk=0, rd_only=1 ===================== PARSING IN CURSOR #10 len=20 dep=0 uid=74 oct=85 lid=74 tim=1248015469332441 hv=3136993528 ad='ff547c80' TRUNCATE TABLE T_BIG END OF STMT PARSE #10:c=1000,e=9511,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1248015469330783 BINDS #10: ===================== . . . EXEC #5:c=2000,e=2298,p=0,cr=2,cu=2,mis=1,r=1,dep=1,og=4,tim=1248015469436750 STAT #5 id=1 cnt=0 pid=0 pos=1 obj=0 op='UPDATE OBJ$ (cr=2 pr=0 pw=0 time=264 us)' STAT #5 id=2 cnt=1 pid=1 pos=1 obj=37 op='INDEX RANGE SCAN I_OBJ2 (cr=2 pr=0 pw=0 time=21 us)' WAIT #10: nam='db file sequential read' ela= 31 file#=51 block#=57365 blocks=1 obj#=147967 tim=1248015469436975 . . . WAIT #10: nam='db file sequential read' ela= 9 file#=54 block#=32261 blocks=1 obj#=147965 tim=1248015469849171 WAIT #10: nam='local write wait' ela= 257 file#=54 block#=32261 p3=0 obj#=147965 tim=1248015469849462 . . .
可以看到TRUNCATE語句,也就是CURSOR 10在運行過程中出現了大量的等待,等待事件主要就是前面提到的db file sequential read和local write wait,而等待的對象在不斷地變化,對應的OBJECT_ID從147967遞減到147960。
下面檢查這些對象:
SQL> SELECT OBJECT_NAME, OBJECT_TYPE 2 FROM DBA_OBJECTS 3 WHERE OBJECT_ID IN (147967, 147966, 147965, 147964, 147963, 147962, 147961, 147960); OBJECT_NAME OBJECT_TYPE ------------------------------ ------------------- IND_BIG_TEMP INDEX IND_BIG_TYPE INDEX IND_BIG_CREATED INDEX IND_BIG_STATUS INDEX IND_BIG_ID INDEX IND_BIG_NAME INDEX IND_BIG_OWNER INDEX T_BIG TABLE
這些索引就是T_BIG表上的索引,檢查這些索引的大小如下:
SQL> SELECT SEGMENT_NAME, SUM(BYTES)/1024/1024/1024 2 FROM USER_SEGMENTS 3 WHERE SEGMENT_NAME IN 4 (SELECT INDEX_NAME 5 FROM USER_INDEXES 6 WHERE TABLE_NAME = 'T_BIG') 7 GROUP BY SEGMENT_NAME; SEGMENT_NAME SUM(BYTES)/1024/1024/1024 ------------------------------ ------------------------- IND_BIG_ID 1 IND_BIG_OWNER 1 IND_BIG_CREATED 1 IND_BIG_NAME 1 IND_BIG_STATUS 1.0625 IND_BIG_TEMP 1.0625 IND_BIG_TYPE 1.0625 7 rows selected. SQL> SELECT INDEX_NAME, INITIAL_EXTENT 2 FROM USER_INDEXES 3 WHERE TABLE_NAME = 'T_BIG'; INDEX_NAME INITIAL_EXTENT ------------------------------ -------------- IND_BIG_OWNER 1073741824 IND_BIG_NAME 1073741824 IND_BIG_ID 1073741824 IND_BIG_STATUS 1098907648 IND_BIG_CREATED 1073741824 IND_BIG_TYPE 1098907648 IND_BIG_TEMP 1098907648 7 rows selected.
這些索引每個都是1GB大小,而且這1GB空間就是初始EXTENT的大小。導致問題的原因已經明確了:表包含了多個索引,且每個索引的初始段太大,使得TRUNCATE語句執行時對索引執行了大量的db file sequential read操作,從而導致了TRUNCATE語句的性能問題。
出現問題的原因找到以后,解決問題就很簡單了,只需要降低索引的初始EXTENT,就可以避免這個問題:
SQL> ALTER INDEX IND_BIG_OWNER REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_NAME REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_ID REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_STATUS REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_CREATED REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_TYPE REBUILD STORAGE (INITIAL 1M); Index altered. SQL> ALTER INDEX IND_BIG_TEMP REBUILD STORAGE (INITIAL 1M); Index altered. SQL> SET TIMING ON SQL> TRUNCATE TABLE T_BIG; Table truncated. Elapsed: 00:00:00.27
重建索引后,TRUNCATE語句執行時間基本恢復正常了。注意到表的初始化EXTENT也將近1GB,對表進行 MOVE,縮小初始EXTENT,還可以進一步提高TRUNCATE操作的效率:
SQL> ALTER TABLE T_BIG MOVE STORAGE (INITIAL 1M);
Table altered.
Elapsed: 00:00:00.09
SQL> TRUNCATE TABLE T_BIG;
Table truncated.
Elapsed: 00:00:00.07
耗時恢復正常數量級。
小結
問題解決后深入分析,該問題看似是由于索引和表的INITIAL EXTENT造成的,但是真正導致問題的罪魁禍首是EXP的COMPRESS=Y。正是由于提問者在執行EXP導出時沒有設置COMPRESS=N,因此EXP會嘗試壓縮表的EXTENT:根據表中所有的EXTENT的大小創建一個很大的初始EXTENT,將表的全部EXTENT壓縮到一個EXTENT中。如果利用這種情況下導出的dmp文件導入到目標庫,就會像例子中那樣創建出初始EXTENT非常大的表和索引了。
- Redis使用手冊
- 企業數字化創新引擎:企業級PaaS平臺HZERO
- 達夢數據庫編程指南
- 大數據:從概念到運營
- Starling Game Development Essentials
- 白話大數據與機器學習
- AI時代的數據價值創造:從數據底座到大模型應用落地
- 金融商業算法建模:基于Python和SAS
- Hadoop大數據開發案例教程與項目實戰(在線實驗+在線自測)
- IPython Interactive Computing and Visualization Cookbook(Second Edition)
- INSTANT Android Fragmentation Management How-to
- Google Cloud Platform for Developers
- 改變未來的九大算法
- Filecoin原理與實現
- 數據分析思維:產品經理的成長筆記