
Title: Dahua Oracle Grid: RAC in the Cloud Era (大话Oracle Grid：云时代的RAC)
Author: Zhang Xiaoming (张晓明)
Chapter length: 137 characters
Last updated: 2019-01-02 08:20:41

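Chapter 2 Reflections Prompted by the Installation

In Chapter 1 we installed Oracle Grid and Oracle Database, created a database, and completed an Oracle RAC 11.2 deployment. Readers with 10.2 RAC experience will have noticed how much has changed in 11.2. So before diving into the internals of Grid, let us first sort through the changes we ran into during installation and build an intuitive picture of Grid 11.2.

2.1 Why Are There So Many Users and Groups

In an Oracle 10.2 RAC deployment we needed only one user (oracle) and one group (dba). Both Clusterware and Database were installed as oracle, and we switched to root only at the end to run the root.sh script.

In the Oracle 11.2 deployment we created two users (oracle and grid) and five groups (oinstall, dba, asmadmin, asmdba, and asmoper; up to six groups can be used, of which oper and asmoper are optional). Grid is installed as the grid user, and Database as the oracle user.

Before reading on, consider the following questions.

Why did Oracle introduce all this variety? Is there an ulterior motive, or a good reason behind it?

Who manages ASM?

Who manages the database? Is it the same person who manages ASM? If not, how do the two roles overlap?

Both users have oinstall as their primary group. Do you think that has anything to do with permissions?

If you are unsure about any of these, read on. Let us first look at the users and groups in a single-instance environment (a single instance that does not use ASM storage).

2.1.1 Old Friends

The three operating system groups commonly used in a single-instance environment are oinstall, dba, and oper.

1. oinstall

This group is also called the Oracle Inventory group, and it represents the "owner" of the Oracle software. Why does such a group exist? Oracle is a giant today with a great many products. Databases alone span several product lines (Oracle Database, TimesTen, Berkeley DB, MySQL), and there is also middleware (WebLogic, Tuxedo), BI systems, and many systems whose purpose I do not even know. All of this software can be downloaded from OTN free of charge, so a machine may well end up with a pile of Oracle software, or with several versions of the same product.

Oracle records which software, and which versions, are installed on a machine. This record is the Oracle inventory. Most Oracle products support the inventory, and they all share a single copy.

Members of the oinstall group have write permission on the Oracle inventory (oraInventory).

The first time any Oracle software is installed on a system (it need not be Oracle Database; any product will do), the Oracle installer, OUI, creates the file /etc/oraInst.loc (on AIX or Linux). On Sun platforms the file is /var/opt/oracle/oraInst.loc. Its contents typically look like this: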

inventory_loc=/u01/app/oracle/oraInventory

inst_group=oinstall

This file is not yet the Oracle inventory itself. It only records the name of the Oracle Inventory group and the location of the inventory. From this file we learn the following:

(1) The Oracle Inventory group is named oinstall;

(2) The Oracle inventory is recorded in the directory /u01/app/oracle/oraInventory.
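The two facts above can be read out of the file mechanically. The following is a minimal sketch, assuming the simple key=value layout shown earlier; the function names orainst_value and inventory_xml_path are hypothetical helpers, not Oracle tools.

```shell
#!/bin/sh
# Hypothetical helpers for reading an oraInst.loc-style file.
# Assumes one key=value pair per line, as in the sample above.

# Print the value of one key (inventory_loc or inst_group).
# $1 = path to the file, $2 = key name
orainst_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Print where inventory.xml should live: the ContentsXML
# subdirectory of the inventory location.
# $1 = path to the oraInst.loc-style file
inventory_xml_path() {
    echo "$(orainst_value "$1" inventory_loc)/ContentsXML/inventory.xml"
}
```

On a Linux or AIX host one would call, for example, `orainst_value /etc/oraInst.loc inst_group` to see which group owns the inventory.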

The actual inventory file is inventory.xml in the ContentsXML subdirectory of that directory. It records the products installed on the machine. Here is an example.

[oracle@dbp ContentsXML]$ more inventory.xml

<?xml version="1.0" standalone="yes" ?>

<VERSION_INFO>

<SAVED_WITH>10.2.0.1.0</SAVED_WITH>

<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>

</VERSION_INFO>

<HOME_LIST>

<NODE_LIST>

</NODE_LIST>

</HOME>

<NODE_LIST>

</NODE_LIST>

</HOME>

</HOME_LIST>

</INVENTORY>

Installing and uninstalling Oracle software both depend on and maintain this file. In an unfamiliar environment we can use it to learn what software is installed. Sometimes it also helps resolve odd problems; Section 2.2 uses this file to solve a real one.

When creating the oinstall group in a RAC environment, make sure its gid is identical on every node. The safe approach is to specify the group ID explicitly when creating the group, which avoids the mismatches that can occur when the system assigns IDs automatically.

If there is no oinstall group, Oracle still needs some group to act as the Oracle Inventory group. By default the installer uses the primary group of the user installing Grid. So make sure every planned Oracle software installation user has this group as its primary group. In other words, since we plan to install Grid and Database with the grid and oracle users respectively, both users must have oinstall as their primary group, which is why the -g option is used when creating the users.

# /usr/sbin/groupadd -g 505 oinstall

This group is critical and deserves attention. Installing RAC in Oracle 11gR2 differs considerably from 10g. In Oracle 10g, Clusterware and Database were both installed as oracle, so access permissions were never an issue. In Oracle 11gR2 the former Clusterware became Grid, and a grid user was added. Both oracle and grid must belong to oinstall so that both installations can access the inventory; otherwise the installation fails with a permission error saying that oraInventory cannot be accessed.

2. The dba group (OSDBA group)

OSDBA is an operating system group that we must create (conventionally named dba). Without it we cannot install the database software or carry out subsequent database administration tasks.

The OSDBA group is tied to Oracle's operating system authentication. Once authenticated by the operating system, users in this group can connect to an Oracle database instance through SQL*Plus as SYSDBA. Members can perform critical, and dangerous, administrative tasks such as creating a database and starting or shutting down a database. The default name of this group is dba. The SYSDBA system privilege even allows access to a database instance when the database is not open. Control over this privilege lies entirely outside the database itself.

Do not confuse the SYSDBA system privilege with the DBA database role. The DBA role does not include the SYSDBA or SYSOPER system privileges. In other words, even a member of the dba group must explicitly request SYSDBA when logging in to obtain the SYSDBA privilege, like this:

oracle> sqlplus '/ as sysdba'

3. The oper group (OSOPER group)

OSOPER is an optional group. It too relates to Oracle's operating system authentication: after being authenticated by the operating system, members can connect to an Oracle instance through SQL*Plus as SYSOPER. Members of this optional group hold a limited set of database administration privileges, such as performing backups. The default name of this group is oper. To use it, choose the Advanced installation type when installing the Oracle database software.

Because it is optional, we may or may not create it. Its purpose is to let certain operating system users exercise some database administration privileges (those of SYSOPER). Note that SYSOPER includes startup and shutdown, so add members to this group with care.

To create the OSOPER group:

# /usr/sbin/groupadd oper

To sum up, in a single-instance environment the installer and owner of the Oracle Database software is usually the operating system user oracle, which is also a member of the oinstall, dba, and oper groups. The primary group of oracle must be oinstall. That is, the oracle user is created like this:

useradd -g oinstall -G dba,oper oracle

During installation of the database software you must specify the operating system groups, as shown in Figure 2-1.

Figure 2-1 Selecting the privileged operating system groups

This screen has two drop-down lists, used to choose the operating system group names that correspond to the OSDBA and OSOPER groups described above. The choices affect the file $ORACLE_HOME/rdbms/lib/config.c, which defines the two macros SS_DBA_GRP and SS_OPER_GRP:

/* Refer to the Installation and User's Guide for further information.*/

/* IMPORTANT: this file needs to be in sync with

rdbms/src/server/osds/config.c, specifically regarding the

number of elements in the ss_dba_grp array. */

#define SS_DBA_GRP "dba"

#define SS_OPER_GRP "oper"

#define SS_ASM_GRP ""

char *ss_dba_grp[] = {SS_DBA_GRP, SS_OPER_GRP, SS_ASM_GRP};

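2.1.2 Groups in a Cluster Environment

In Oracle 10g, Clusterware and Database were both managed by a single role, the DBA. With Oracle 11gR2 RAC, Oracle began advocating that management of the cluster environment be separated from management of the database. This follows from Oracle's strategy: it went from talking down the "cloud" to actively promoting it, evidently having seen the huge commercial opportunity in the cloud market.

Among Oracle's product lines, Grid (Clusterware) is the one most likely to carry the banner of an "Oracle cloud". Oracle is therefore eager to remove the database branding from this product. It now stresses that GI (Grid Infrastructure) is infrastructure and that a RAC database is just one ordinary resource in that environment, one that an ordinary DBA can manage after a little training. Managing the infrastructure itself, by contrast, calls for a more independent and more specialized role. This is the point Oracle keeps emphasizing.

This view is not just talk; it has been built into the product. The appearance in the Grid environment of users and groups dedicated to managing Grid, alongside users and groups for managing the database, is strong evidence.

So Grid in Oracle 11.2 adds three groups dedicated to managing ASM.

1. The asmadmin (OSASM) group

To use ASM in Oracle 11.2 you must create the asmadmin (OSASM) group; it is a required group. The purpose is to place Oracle ASM administrators and Oracle Database administrators in separate privilege groups.

Members of the OSASM group can connect to an Oracle ASM instance through SQL*Plus as SYSASM using operating system authentication. SYSASM was introduced in Oracle 11g Release 1, and by Oracle 11gR2 it has been fully separated from SYSDBA. The SYSASM privilege no longer grants access to RDBMS instances.

Replacing SYSDBA with SYSASM mainly serves to split off the system privileges of the storage layer, giving a clear division of responsibility between ASM administration and database administration. It helps prevent different databases that share the same storage from accidentally overwriting one another's files.

Members of the OSASM group are granted the SYSASM privilege, which allows mounting and dismounting disk groups and other storage administration tasks. Because managing Grid is to a large extent managing ASM, members of this group can manage both Oracle Clusterware and Oracle ASM; after all, Clusterware + ASM = Grid.

(1) The SYSASM privilege.

In Oracle 10.2, an ASM instance was started the same way as an RDBMS database: the administrator logged in as sysdba and ran the startup command.

In Oracle 11.2 the administrator can no longer start or stop the ASM instance as sysdba and must connect as sysasm to do so.

If you try to start or shut down the ASM instance as sysdba, Oracle reports insufficient privileges. Here is an example of shutting down an ASM instance:

[root@searchdb2 ~]# sqlplus "sys/ *** as sysdba"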

SQL*Plus: Release 11.2.0.1.0 Production on Thu Sep 1 10:18:14 2011

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown immediate;

ORA-01031: insufficient privileges

We have to exit and reconnect to the instance as sysasm before the command will run:

[grid@indexserver4 ~]$ sqlplus " / as sysasm"

SQL*Plus: Release 11.2.0.2.0 Production on Wed Jun 13 15:56:44 2012

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown immediate;

ASM diskgroups volume disabled

ASM diskgroups dismounted

ASM instance shutdown

SQL> exit

(2) How to shut down an ASM instance correctly.

If you happen to run the shutdown command above on an Oracle 11.2 RAC, it may refuse to shut down, for example:

[grid@indexserver1 ~]$ export ORACLE_SID=+ASM1

[grid@indexserver1 ~]$ sqlplus " / as sysasm"

SQL*Plus: Release 11.2.0.2.0 Production on Thu Jun 14 16:05:08 2012

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> shutdown;

ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 22989)

The shutdown immediate command does not work either:

SQL> shutdown immediate;

ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 22989)

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

Running the command again after shutting down the database gives the same result. This is because the relationship between CRS and ASM changed in Oracle 11.2 RAC.

In Oracle 10g, ASM could hold only Oracle database files, so ASM had just one kind of client: the Oracle database. In a 10g environment we therefore shut down RAC in this order: shut down the database → shut down ASM → shut down CRS.

In Oracle 11gR2, however, when installing with OUI, the cluster's own OCR and voting files are stored in ASM alongside the database data files.

This is where the problem arises. Because the cluster files are also in ASM, CRSD becomes an ASM client too. If you shut down ASM directly as in Oracle 10g, the error above is raised because a client is still connected to the ASM instance. So in Oracle 11gR2 the ASM instance can only be stopped together with CRS. The correct way to shut it down is to stop CRS.

Do it as root:

[root@indexserver1 ~]# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'indexserver1'

CRS-2673: Attempting to stop 'ora.crsd' on 'indexserver1'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'indexserver1'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.registry.acfs' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'indexserver1'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.indexserver1.vip' on 'indexserver1'

CRS-2677: Stop of 'ora.indexserver1.vip' on 'indexserver1' succeeded

CRS-2672: Attempting to start 'ora.indexserver1.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.registry.acfs' on 'indexserver1' succeeded

CRS-2676: Start of 'ora.indexserver1.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'indexserver1'

CRS-2677: Stop of 'ora.asm' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'indexserver1'

CRS-2677: Stop of 'ora.ons' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'indexserver1'

CRS-2677: Stop of 'ora.net1.network' on 'indexserver1' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'indexserver1' has completed

CRS-2677: Stop of 'ora.crsd' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.ctssd' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.evmd' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.asm' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'indexserver1'

CRS-2677: Stop of 'ora.asm' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'indexserver1'

CRS-2677: Stop of 'ora.drivers.acfs' on 'indexserver1' succeeded

CRS-2677: Stop of 'ora.evmd' on 'indexserver1' succeeded

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'indexserver1' succeeded

CRS-2677: Stop of 'ora.mdnsd' on 'indexserver1' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'indexserver1'

CRS-2677: Stop of 'ora.cssd' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.diskmon' on 'indexserver1'

CRS-2673: Attempting to stop 'ora.crf' on 'indexserver1'

CRS-2677: Stop of 'ora.crf' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'indexserver1'

CRS-2677: Stop of 'ora.diskmon' on 'indexserver1' succeeded

CRS-2677: Stop of 'ora.gipcd' on 'indexserver1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'indexserver1'

CRS-2677: Stop of 'ora.gpnpd' on 'indexserver1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'indexserver1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

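Note: do not kill the ASM processes directly or use shutdown abort to stop the ASM instance; doing so brings CRS down as well.

2. The asmdba (OSDBA for ASM) group

Members of the ASM database administrator group (OSDBA for ASM) hold a subset of the SYSASM privileges; the group is granted read and write access to the files managed by Oracle ASM.

The installer and owner of the Grid software (usually grid) and the owner of the Oracle Database software (usually oracle) must be members of this group. Users who belong to the OSDBA group (dba) and need access to ASM-managed files must also be members of the OSDBA for ASM group.

So both the grid user and the oracle user need to belong to this group.

3. The asmoper (OSOPER for ASM) group

This group is also optional. Create it if you need a separate operating system user with partial administrative privileges on the ASM instance (the SYSOPER privilege for ASM, which includes starting and stopping the Oracle ASM instance). By default, members of the OSASM group have all the privileges granted by SYSOPER for ASM.

To use the ASM operator group, you must choose the Advanced installation type when installing Grid. OUI then asks for the name of the group.

If you have an OSOPER for ASM group, the Grid software owner (grid) must also be a member of it.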
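The group and user layout described in this section can be summarized as a list of commands. The following is a sketch only: gid 505 for oinstall comes from the earlier example, while every other gid is an assumed placeholder, and print_account_plan is a hypothetical helper that merely prints the commands so they can be reviewed and then run as root on every node.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) the account setup for an
# 11.2 Grid + Database host. Explicit gids keep the IDs identical on
# all nodes. Only 505 is taken from the text; 506-510 are assumptions.
print_account_plan() {
    cat <<'EOF'
/usr/sbin/groupadd -g 505 oinstall
/usr/sbin/groupadd -g 506 dba
/usr/sbin/groupadd -g 507 oper
/usr/sbin/groupadd -g 508 asmadmin
/usr/sbin/groupadd -g 509 asmdba
/usr/sbin/groupadd -g 510 asmoper
/usr/sbin/useradd -g oinstall -G asmadmin,asmdba,asmoper grid
/usr/sbin/useradd -g oinstall -G dba,oper,asmdba oracle
EOF
}
```

Both users get oinstall as their primary group and both belong to asmdba, as required above; oper and asmoper are the optional groups and can be dropped from the plan if they are not needed.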

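The screen shown in Figure 2-2, which appears during the Grid installation, is where these three groups are specified.

Figure 2-2 ASM groups in the Grid installation

Now look back at Table 1-2; it should make much more sense at this point.

2.1.3 Should the GI Owner and the DB Owner Be Separate

Let us now examine this question in more depth.

First, Oracle Clusterware, or Oracle Grid, is just one piece of software in an Oracle RAC environment, so it seems natural to hand it to the DBA. However, many Clusterware administration tasks must run as root, which lies beyond the DBA's own turf. So there is a case for having the system administrator maintain Clusterware. Moreover, ASM provides storage management (albeit software-based storage management implemented by Oracle), which by rights should also go to the system administrator or to the storage administrator, if there is one.

In Oracle 10g we could install CRS and the RDBMS as two different users (although in practice nobody did), achieving some degree of separation of duties. But ASM in Oracle 10g was still part of the RDBMS software, and many ASM tasks in 10g were still carried out through SQL statements, so managing ASM in Oracle 10g remained mostly a DBA task.

Oracle 11g introduced the idea of role-separated management. ASM and Clusterware are integrated into one product, so different operating system groups can be used to separate DBAs from ASM administrators. In addition, Oracle 11g provides the new ASM Configuration Assistant (ASMCA) and the asmcmd command-line tool, so these tasks can now be handed over entirely to storage and system administrators.

Precisely because ASM and Clusterware are integrated, I recommend giving the Grid home and the RDBMS home different owners even if you do not plan to use role-separated management. That leaves room to split the duties later.

2.2 Fixing DBCA When It Does Not Recognize the Cluster

This problem is a real-world case involving the Oracle inventory discussed earlier.

If you completed the cluster deployment following the earlier steps, try running DBCA on every node and see whether you get the screen in Figure 2-3 or the one in Figure 2-4. In a cluster, DBCA on the node where runInstaller was executed usually recognizes the cluster and presents the correct screen (Figure 2-3), which is the Oracle RAC welcome page.

Figure 2-3 DBCA that recognizes the cluster environment

Figure 2-4 DBCA that does not recognize the cluster environment

DBCA on the other nodes may look like Figure 2-4, which happens because DBCA fails to recognize the cluster environment. For more on DBCA issues, see http://docs.oracle.com/cd/E11882_01/install.112/e25666/dbcacrea.htm#BGBGGEAH . If you click [Next], you will find that everything offered is for a single instance.

Why is that? The cause is mainly the inventory. Let us compare the contents of inventory.xml on the two nodes.

Here is the content on the node that shows the Oracle RAC welcome page: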

<?xml version="1.0" standalone="yes" ?>

<VERSION_INFO>

<SAVED_WITH>11.2.0.2.0</SAVED_WITH>

<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>

</VERSION_INFO>

<HOME_LIST>

<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0.2/grid" TYPE="O" IDX="1" CRS="true">

<NODE_LIST>

</NODE_LIST>

</HOME>

<NODE_LIST>

</NODE_LIST>

</HOME>

</HOME_LIST>

</INVENTORY>

And here is the content on the node that does not recognize the cluster:

<?xml version="1.0" standalone="yes" ?>

<VERSION_INFO>

<SAVED_WITH>11.2.0.2.0</SAVED_WITH>

<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>

</VERSION_INFO>

<HOME_LIST>

<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/11.2.0.2/database/griddatabase" TYPE="O" IDX="2">

<NODE_LIST>

</NODE_LIST>

</HOME>

<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0.2/grid" TYPE="O" IDX="1" CRS="true">

<NODE_LIST>

</NODE_LIST>

</HOME>

</HOME_LIST>

</INVENTORY>

Do you see the problem? On the failing node, GRID_HOME, the entry with IDX=1, is listed last. Rearranging the entries so that it comes first fixes the problem.
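A quick way to spot this condition on each node is to look at the first HOME entry in inventory.xml. The sketch below assumes each HOME element starts on its own line with ASCII spaces between attributes, as in the listings above; crs_home_is_first is a hypothetical helper name, not an Oracle utility.

```shell
#!/bin/sh
# Hypothetical check: succeed if the first <HOME ...> entry in the
# given inventory.xml is the one marked CRS="true" (the Grid home).
# $1 = path to inventory.xml
crs_home_is_first() {
    # '<HOME ' (with the trailing space) skips <HOME_LIST>.
    first_home=$(grep '<HOME ' "$1" | head -n 1)
    case "$first_home" in
        *'CRS="true"'*) return 0 ;;
        *)              return 1 ;;
    esac
}
```

Running it against the inventory on every node, for example `crs_home_is_first /u01/app/oracle/oraInventory/ContentsXML/inventory.xml || echo "Grid home is not first"`, shows which nodes need their entries reordered. Back up the file before editing it by hand.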

2.3 Why Did We Not Configure a Time Service

A cluster has very strict requirements for time consistency across nodes. Oracle uses the SCN to order database transactions, and an SCN behaves much like a timestamp. Imagine that the clocks on two nodes differ: a transaction on node 1 may actually run first and one on node 2 later, yet the SCNs suggest the opposite. That would cause serious data inconsistency.

Consequently, if node clocks drift apart in a cluster, the cluster is reconfigured, meaning that some node is rebooted. This follows from how the cluster detects a hung node: a large time jump makes the cluster wrongly conclude that a node has hung badly, which triggers node fencing. A time synchronization method such as NTP, if not configured carefully, can easily cause such large jumps. Several well-known bugs in Oracle 10.2 involved node reboots caused by time synchronization.

So in 11.2 RAC a time service is still needed, indeed required. Yet in the previous chapter we did no time service configuration during preparation. Why? Because Oracle 11.2 introduced the CTSS service, which takes care of this for us.

Oracle made many improvements in 11.2 to simplify RAC deployment, including simplifying the time service. In Oracle 11.2 there are two time synchronization mechanisms to choose from: the NTP service provided by the operating system, or the time synchronization service that ships with Grid (CTSS, the Cluster Time Synchronization Service daemon).

2.3.1 Using the NTP Service

To use the operating system's NTP service, edit the NTP parameter file and set the -x flag, which prevents the time from being set backward. After changing the configuration, restart the NTP service.

Edit the /etc/sysconfig/ntpd file:

# Drop root to id 'ntp:ntp' by default.

OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"

# Set to 'yes' to sync hw clock after successful ntpdate

SYNC_HWCLOCK=no

# Additional options for ntpdate

NTPDATE_OPTIONS=""

Restart the NTP service:

[root@indexserver3 grid_data]# service ntpd status

ntpd (pid 4250) is running...

[root@indexserver3 grid_data]# service ntpd stop

Shutting down ntpd: [ OK ]

[root@indexserver3 grid_data]# service ntpd start

ntpd: Synchronizing with time server: [ OK ]

Starting ntpd: [ OK ]

[root@indexserver3 grid_data]# ps -ef|grep ntpd

ntp      22244     1  0 17:25 ?        00:00:00 ntpd -x -u ntp:ntp -p /var/run/ntpd.pid

root     22250 15074  0 17:25 pts/0    00:00:00 grep ntpd
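Before relying on NTP in a cluster, it is worth confirming on each node that the -x flag really is in the options line. A minimal sketch, assuming the OPTIONS="..." layout of /etc/sysconfig/ntpd shown above; ntpd_has_slew_flag is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical check: succeed if the OPTIONS line of an ntpd
# sysconfig file contains the -x flag as a separate option.
# $1 = path to the file (for example /etc/sysconfig/ntpd)
ntpd_has_slew_flag() {
    grep '^OPTIONS=' "$1" |
        grep -Eq -- '(["[:space:]])-x([[:space:]"]|$)'
}
```

For example, `ntpd_has_slew_flag /etc/sysconfig/ntpd || echo "add -x to OPTIONS"` flags a node whose configuration still allows the clock to be stepped.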

2.3.2 Using the CTSS Service

To use the cluster time synchronization service provided by Grid, uninstall the operating system's NTP service or simply disable it. Disabling it involves the following steps:

Stop the running ntpd service;

Remove the service from the initialization sequence;

Remove (or rename) the ntp.conf file.

[root@searchdb1 ~]# service ntpd status

ntpd (pid 7447) is running...

[root@searchdb1 ~]# service ntpd stop

Shutting down ntpd: [ OK ]

[root@searchdb1 ~]# chkconfig ntpd

[root@searchdb1 ~]# chkconfig ntpd off

[root@searchdb1 ~]# mv /etc/ntp.conf /etc/ntp.conf.bak

2.3.3 The Relationship Between CTSS and NTP

If CTSS finds that NTP is configured or running on all nodes in the cluster, it runs in observer mode. In this mode CTSS only records time discrepancies in the cluster alert log and makes no adjustments.

If CTSS finds that NTP is not configured or running on every node, it runs in active mode and synchronizes the system clocks with the master node. Synchronization takes two forms.

When a node joins the cluster and its clock differs from the reference, but the difference is within a certain limit, the clock is synchronized in steps, a small adjustment at a time. If the difference exceeds the limit, the node is not allowed to join the cluster, and a message is written to the cluster alert log.

During normal operation, if a node's clock drifts from the master node's, the system clock is sped up or slowed down until it is back in sync. This is called clock slewing.

Note: CTSS never sets the system clock backward. Oracle 10.2 RAC had bugs in which setting the clock backward caused node reboots.

To switch CTSS from observer mode to active mode, simply remove the NTP service from all nodes; CTSS detects the change and switches mode. The reverse also holds: starting NTP on all nodes makes CTSS switch to observer mode.

You can check which mode CTSS is running in as follows:

[grid@indexserver4 ~]$ crsctl check ctss

CRS-4700: The Cluster Time Synchronization Service is in Observer mode.

This shows that CTSS is running in observer mode.
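When checking many nodes it can be handy to reduce that message to a single word. The sketch below classifies the text printed by crsctl check ctss; only the observer-mode message appears in this chapter, so the "Active mode" wording matched here is an assumption, and ctss_mode is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: classify the output of `crsctl check ctss`.
# $1 = the text printed by the command
# Prints observer, active, or unknown.
ctss_mode() {
    case "$1" in
        *"Observer mode"*) echo observer ;;
        *"Active mode"*)   echo active ;;   # wording assumed, not shown in the text
        *)                 echo unknown ;;
    esac
}
```

On a cluster node it could be used as `ctss_mode "$(crsctl check ctss)"`.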

2.4 What Is IPMI

IPMI stands for Intelligent Platform Management Interface. What does that mean in practice?

In Dahua Oracle RAC (大话Oracle RAC) we discussed I/O fencing, or node eviction: a node in the cluster stops responding, with no disk heartbeat and no reply to ping. The node may have shut down, may be overloaded and running slowly, or may be completely hung. The cluster then has to reconfigure, evicting this node and letting the remaining nodes form a new cluster. This is done to resolve split-brain.

Evicting a node really means making it reboot, and there are two ways to do so: suicide, and STONITH (Shoot The Other Node In The Head).

Before Oracle 11.2 the first method was used. But deciding that a node should reboot and the node actually being able to reboot are two different things. If the node is completely hung, it cannot solve the problem by itself, and this is where IPMI comes in: IPMI uses the second method.

IPMI is an industry standard. For a cluster to use IPMI, every node must have a BMC (Baseboard Management Controller) whose firmware supports IPMI version 1.5, the version that supports IPMI over LAN. In other words, if the healthy nodes find that a node has stopped responding for some reason, they tell its BMC over the LAN to restart the node, most simply by cutting and restoring power. Restarting a node through IPMI is the last resort after everything else has been tried, and its purpose, again, is to protect data consistency. If your machines have BMC devices, you can configure IPMI.

2.5 The Difference Between ORACLE_BASE and ORACLE_HOME

We all know ORACLE_HOME well; ORACLE_BASE is a variable required when deploying Oracle 11.2. To understand how the two directories relate, you need to understand Oracle's Optimal Flexible Architecture (OFA) standard, which ensures a consistent directory structure and file naming. I used to ignore OFA and follow my own scheme. Since Oracle 11.2, though, I have made a point of understanding OFA and found it really good, so I recommend it to readers.

2.5.1 OFA and Software Installation

Many environments and documents follow the OFA standard, so it is important to understand it. Figure 2-5 shows the standard OFA directory structure and file naming conventions. The figure does not show every directory and file, but the key, commonly used ones are there.

Figure 2-5 Standard OFA directory structure

There are several key directories in OFA to know, including:

the Oracle Inventory directory;

the Oracle Base directory (ORACLE_BASE);

the Oracle Home directory (ORACLE_HOME);

the Oracle Network directory (TNS_ADMIN);

the Automatic Diagnostic Repository (ADR_HOME).

1. The Oracle Inventory directory

Note how this directory relates to the Oracle Base directory described next.

When I first met the concept of ORACLE_BASE, I jumped to a conclusion: I took it to be the root of a tree under which everything Oracle-related lives. If you think the same, take note: the inventory directory does not belong under ORACLE_BASE. It sits at the same level as ORACLE_BASE.

This directory holds the inventory of Oracle software installed on the machine. All Oracle software installed on the machine needs and shares it. The first time we install Oracle software, Oracle uses the following rules to locate this directory.

(1) Is there an OFA-compliant directory structure, meaning a directory that follows the /u[01-09]/app naming convention? If so, the installer creates the inventory under it, for example /u01/app/oraInventory.

(2) If the ORACLE_BASE variable is defined in the oracle user's environment, the installer creates the directory at ORACLE_BASE/../oraInventory. The two dots ".." denote the parent directory of ORACLE_BASE, so oraInventory is at the same level as ORACLE_BASE. For example, if ORACLE_BASE is /ora/app/oracle, the directory is /ora/app/oraInventory.

(3) If the installer finds neither an OFA-compliant directory structure nor an ORACLE_BASE variable, it creates the directory under the oracle user's home directory, that is, /home/oracle/oraInventory.
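The three rules above amount to a small decision procedure. The following sketch restates them as a shell function for illustration; it is not how OUI is implemented, and inventory_dir and its argument convention are hypothetical.

```shell
#!/bin/sh
# Hypothetical restatement of the three lookup rules.
# $1 = an existing OFA-style directory such as /u01/app ("" if none)
# $2 = ORACLE_BASE ("" if not set)
# $3 = home directory of the installing user
inventory_dir() {
    if [ -n "$1" ]; then
        # Rule 1: an OFA-compliant directory exists.
        echo "$1/oraInventory"
    elif [ -n "$2" ]; then
        # Rule 2: ORACLE_BASE/../oraInventory, a sibling of ORACLE_BASE.
        echo "$(dirname "$2")/oraInventory"
    else
        # Rule 3: fall back to the user's home directory.
        echo "$3/oraInventory"
    fi
}
```

For instance, `inventory_dir "" /ora/app/oracle /home/oracle` follows rule 2 and yields the sibling directory of ORACLE_BASE described in the text.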

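2. The Oracle Base directory

The Oracle Base directory is the top-level directory for Oracle software installations. Multiple versions of Oracle software can be installed under it. In the OFA standard the Oracle Base directory looks like this: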

/<mount_point>/app/<software_owner>

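The mount point is typically named something like /u01, /ora01, or /oracle. You can name it according to your own site standards.

The software owner is usually oracle, the operating system user you use to install the Oracle software. So a complete Oracle Base directory might look like this: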

/ora01/app/oracle

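3. The Oracle Home directory

The Oracle Home directory is the installation directory of a specific piece of software, such as Oracle Database 11g or Oracle Database 10g. Each product, and each version of the same product, must be in its own directory. An OFA-compliant Oracle Home directory looks like this: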

ORACLE_BASE/product/<version>/<install_name>

In our environment the version might be 11.2.0.1, 11.2.0.2, or 10.2.0.1, and install_name might be db_1 or devdb. For example, here is the home of an 11.2.0.1 database:

/ora01/app/oracle/product/11.2.0.1/db_1

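Many DBAs, myself included, dislike the db_1 directory under ORACLE_HOME and see little use for it. In fact the db_1 level lets us keep several separate sets of binaries: one for development, one for testing, one for production. If you are sure you do not need that many installations, you can drop this level.

4. Oracle Base and Oracle Home for Grid

For Grid, however, ORACLE_BASE and ORACLE_HOME work differently: the Grid ORACLE_HOME must not be a subdirectory of ORACLE_BASE. If you define them like this: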

ORACLE_BASE=/u01/app/grid

ORACLE_HOME=/u01/app/grid/11.2.0.2

the installation reports an error, as shown in Figures 2-6 and 2-7.

Figure 2-6 Specifying the installation location

Figure 2-7 Installation error

The official Oracle documentation explains it this way: Even if you do not use the same software owner to install Grid Infrastructure (Oracle Clusterware and Oracle ASM) and Oracle Database, be aware that running the root.sh script during the Oracle Grid Infrastructure installation changes ownership of the home directory where clusterware binaries are placed to root, and all ancestor directories to the root level (/) are also changed to root. For this reason, the Oracle Grid Infrastructure for a cluster home cannot be in the same location as other Oracle software.

That is, root.sh, run during the Grid installation, changes the owner of the directory containing Grid to root, and does the same for every ancestor directory up to the top. This would affect other Oracle software, so the Grid ORACLE_HOME must not be placed in a subdirectory of ORACLE_BASE.

So for the grid user, the two directories should be parallel to each other.
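This constraint is easy to check before launching the installer. The sketch below does a plain string comparison of the two paths; is_under is a hypothetical helper, and it assumes both paths are absolute, normalized, and written without a trailing slash.

```shell
#!/bin/sh
# Hypothetical check: succeed if directory $2 is $1 itself or lies
# somewhere below $1. Pure string comparison, no filesystem access.
# $1 = candidate parent (for example the grid user's ORACLE_BASE)
# $2 = candidate child  (for example the grid user's ORACLE_HOME)
is_under() {
    case "$2/" in
        "$1"/*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

With the values from the failing example, `is_under /u01/app/grid /u01/app/grid/11.2.0.2` succeeds, which signals a layout the Grid installer will reject; a home such as /u01/app/11.2.0/grid does not match.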

5. The Oracle Network directory

Some Oracle tools use TNS_ADMIN to locate the network configuration files. The directory is ORACLE_HOME/network/admin, and it contains the sqlnet.ora, tnsnames.ora, and listener.ora files.

6. The ADR directory

The ADR (Automatic Diagnostic Repository) directory first appeared in Oracle 11g. The files in it are key to diagnosing Oracle database problems. The directory is defined as ORACLE_BASE/diag/rdbms/<dbname>/<instancename>, where dbname is the database name and instancename is the instance name. In a single-instance environment the database name and instance name are the same, except that the database name is in lowercase and the instance name in uppercase. For example, for the database testdb:

/ora01/app/oracle/diag/rdbms/testdb/TESTDB
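The path pattern can be assembled from its three parts. A small sketch, assuming the convention stated above that the database-name component is lowercase while the instance component is used as given; adr_home is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: build the ADR directory for a database instance.
# $1 = ORACLE_BASE, $2 = database name, $3 = instance name
adr_home() {
    # The database-name component is lowercased; the instance name
    # is kept exactly as passed in.
    db=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    echo "$1/diag/rdbms/$db/$3"
}
```

For example, `adr_home /ora01/app/oracle testdb TESTDB` reproduces the directory shown above.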

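7. The ORACLE_BASE and ORACLE_HOME environment variables

Now that we understand the OFA standard, before installing we need to set the ORACLE_BASE and ORACLE_HOME environment variables in each installation user's environment. The settings differ for the grid and oracle users.

Environment variables for the grid user: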

export ORACLE_BASE=/u01/app/grid

export ORACLE_HOME=/u01/app/11.2.0/grid

PATH=$ORACLE_HOME/bin:$PATH:$HOME/bin

Environment variables for the oracle user:

export ORACLE_BASE=/u01/app/database

export ORACLE_HOME=$ORACLE_BASE/11.2.0.2

PATH=$ORACLE_HOME/bin:$PATH:$HOME/bin

2.5.2 Shared or Local ORACLE_HOME

In a RAC cluster, the Oracle software on every node accesses the RAC database. This raises a question: should the Oracle software itself live on shared storage (a shared ORACLE_HOME) or locally on each node (a local ORACLE_HOME)?

A shared ORACLE_HOME certainly has advantages in configuration, space requirements, and upgrades. But every upgrade then requires a complete outage. And because there is only one copy of ORACLE_HOME, it is an obvious single point of failure: any problem with the shared space brings the database down.

The recommendation is therefore to use a local ORACLE_HOME. Do not begrudge the small amount of extra space, and do not be put off by the extra work of preparing the environment.

2.6 SCAN

During the Grid installation you fill in an item called SCAN (see Figure 1-9), short for Single Client Access Name. I devote a separate chapter to SCAN; this section is just a brief introduction.

Consider how an Oracle 10.2 RAC client accesses the database. The client's TNSNAMES.ORA would look like this:

Testdb =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = center-rac1-vip)(PORT = 1521))

(ADDRESS = (PROTOCOL = TCP)(HOST = center-rac2-vip)(PORT = 1521))

(LOAD_BALANCE = yes)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME =testdb)

)

)

The client's file records the VIP address of every node in the RAC environment, which means the client must know the topology of the whole cluster. Whenever the topology changes, for example when a node is added or removed, TNSNAMES.ORA on every client has to be updated, or clients may no longer be able to reach the database.

In other words, an Oracle RAC environment is not fully transparent to users, and that is a problem. SCAN exists to solve it.

SCAN itself is a fully qualified domain name (FQDN), just like that of a familiar website. We reach Google through www.google.com; Google has hundreds or thousands of servers providing the web service, and we neither know nor care which one we are connected to. As long as the page opens, whatever changes happen on the servers behind it do not concern us. Google's servers are transparent to the client.

SCAN has the same effect. With SCAN, the client's TNSNAMES.ORA looks like this:

Testdb =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = indexgrid.wxxr.com.cn)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = testdb)

)

)

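Now all the client needs to know is a single SCAN, indexgrid.wxxr.com.cn, which you can think of as a domain name. The client needs no knowledge of the RAC internals, and whatever changes inside RAC, the client does not need to know. RAC is fully transparent to the client.

Achieving this relies on a series of technologies, including DNS, GNS, SCAN VIPs, SCAN listeners, and listeners, all of which are covered in later chapters.

2.7 HAIP (Replacing NIC Bonding)

As everyone knows, a RAC environment has a private network carrying heartbeat messages and Cache Fusion data. The private network matters a great deal for RAC stability and performance, so we recommend building it with multiple bonded NICs. NIC bonding goes by different names on different operating systems: bonding on some, teaming or trunking on others. Whatever the name, the goal is the same: redundancy that provides load distribution and failover.

Starting with version 11.2.0.2, Oracle Grid has a built-in HA technology for private NICs called HAIP. It differs from the familiar NIC bonding in that it uses a multiport-listening-endpoint architecture. Each private NIC is assigned a HAIP address, which does not need to be defined in advance because it is generated automatically. Up to four private NICs are supported.

By default Oracle RAC uses all of these HAIP addresses for private network communication, which provides load balancing. If a private NIC fails, Oracle automatically moves its HAIP address to another NIC.

In RAC, Oracle ASM (clustered ASM) and other cluster components such as CSS, CRS, CTSS, and EVM can all make use of HAIP.

When there are multiple private NICs, the installation screen looks like Figure 2-8.

Figure 2-8 Configuration screen with multiple NICs

Note: HAIP is available only from Oracle Grid 11.2.0.2 onward; it does not exist in 11.2.0.1.

You can view the HAIP addresses with the operating system's ip command.

Node 1:

[grid@indexserver1 ~]$ /sbin/ip ad sh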

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:43:c5:0d brd ff:ff:ff:ff:ff:ff

inet 192.168.1.70/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.80/24 brd 192.168.1.255 scope global secondary eth0:1

inet 192.168.1.86/24 brd 192.168.1.255 scope global secondary eth0:2

inet6 fe80::7a2b:cbff:fe43:c50d/64 scope link

valid_lft forever preferred_lft forever

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:43:c5:0f brd ff:ff:ff:ff:ff:ff

inet 10.0.0.70/8 brd 10.255.255.255 scope global eth1

inet 169.254.177.128/16 brd 169.254.255.255 scope global eth1:1

inet6 fe80::7a2b:cbff:fe43:c50f/64 scope link

valid_lft forever preferred_lft forever

Node 2:

[root@indexserver2 11.2.0.2]# ip ad sh

……

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:43:5c:0c brd ff:ff:ff:ff:ff:ff

inet 10.0.0.71/8 brd 10.255.255.255 scope global eth1

inet 169.254.127.239/16 brd 169.254.255.255 scope global eth1:1

inet6 fe80::7a2b:cbff:fe43:5c0c/64 scope link

valid_lft forever preferred_lft forever

……

Node 3:

[root@indexserver3 11.2.0.2]# ip ad sh

……

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:43:76:26 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.72/8 brd 10.255.255.255 scope global eth1

inet 169.254.59.18/16 brd 169.254.255.255 scope global eth1:1

inet6 fe80::7a2b:cbff:fe43:7626/64 scope link

valid_lft forever preferred_lft forever

……

Node 4:

[root@indexserver4 11.2.0.2]# ip ad sh

……

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:42:79:37 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.73/8 brd 10.255.255.255 scope global eth1

inet 169.254.162.84/16 brd 169.254.255.255 scope global eth1:1

inet6 fe80::7a2b:cbff:fe42:7937/64 scope link

valid_lft forever preferred_lft forever

……

In this four-node RAC, eth1 is the private NIC. Besides the private IP 10.0.0.* that we defined, it carries a 169.254.*.* address, and that is the HAIP address.
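Picking the HAIP addresses out of long ip listings by eye gets tedious on larger clusters. The sketch below filters ip addr show output for addresses in the 169.254.0.0/16 range seen in the listings above; list_haip is a hypothetical helper name, and it assumes the interface label is the last field of each inet line, as in that output.

```shell
#!/bin/sh
# Hypothetical filter: read `ip addr show` output on stdin and print
# "<interface label> <address/prefix>" for every 169.254.x.x address.
list_haip() {
    awk '$1 == "inet" && $2 ~ /^169\.254\./ { print $NF, $2 }'
}
```

On a node it could be used as `/sbin/ip addr show | list_haip`. Note that any other link-local address on the host would also be listed, so the result is a hint rather than proof that an address belongs to HAIP.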

That was the case with a single private NIC. Below is the case in Oracle 11.2.0.3 with multiple private NICs.

First node:

[root@promotiondbp ~]# ip ad sh

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:a3:46 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.33/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.40/24 brd 192.168.1.255 scope global secondary eth0:1

inet 192.168.1.39/24 brd 192.168.1.255 scope global secondary eth0:3

inet 192.168.1.37/24 brd 192.168.1.255 scope global secondary eth0:4

inet6 fe80::7a2b:cbff:fe44:a346/64 scope link

valid_lft forever preferred_lft forever

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:a3:48 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.10/8 brd 10.255.255.255 scope global eth1

inet 169.254.8.96/18 brd 169.254.63.255 scope global eth1:1

inet 169.254.242.150/18 brd 169.254.255.255 scope global eth1:2

inet6 fe80::7a2b:cbff:fe44:a348/64 scope link

valid_lft forever preferred_lft forever

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:a3:4a brd ff:ff:ff:ff:ff:ff

inet 10.0.0.1/8 brd 10.255.255.255 scope global eth2

inet 169.254.111.15/18 brd 169.254.127.255 scope global eth2:1

inet6 fe80::7a2b:cbff:fe44:a34a/64 scope link

valid_lft forever preferred_lft forever

5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:a3:4c brd ff:ff:ff:ff:ff:ff

inet 10.0.0.2/8 brd 10.255.255.255 scope global eth3

inet 169.254.167.173/18 brd 169.254.191.255 scope global eth3:1

inet6 fe80::7a2b:cbff:fe44:a34c/64 scope link

valid_lft forever preferred_lft forever

6: eth4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:10:18:9f:6f:c8 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.3/8 brd 10.255.255.255 scope global eth4

7: eth5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:10:18:9f:6f:ca brd ff:ff:ff:ff:ff:ff

inet 10.0.0.4/8 brd 10.255.255.255 scope global eth5

8: sit0: <NOARP> mtu 1480 qdisc noop

link/sit 0.0.0.0 brd 0.0.0.0

Second node:

[root@promotiondbs grid]# ip ad sh

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:8d:72 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.34/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.38/24 brd 192.168.1.255 scope global secondary eth0:1

inet 192.168.1.41/24 brd 192.168.1.255 scope global secondary eth0:2

inet6 fe80::7a2b:cbff:fe44:8d72/64 scope link

valid_lft forever preferred_lft forever

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:8d:74 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.5/8 brd 10.255.255.255 scope global eth1

inet 169.254.43.98/18 brd 169.254.63.255 scope global eth1:1

inet 169.254.207.131/18 brd 169.254.255.255 scope global eth1:2

inet6 fe80::7a2b:cbff:fe44:8d74/64 scope link

valid_lft forever preferred_lft forever

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:8d:76 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.6/8 brd 10.255.255.255 scope global eth2

inet 169.254.126.213/18 brd 169.254.127.255 scope global eth2:1

inet6 fe80::7a2b:cbff:fe44:8d76/64 scope link

valid_lft forever preferred_lft forever

5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 78:2b:cb:44:8d:78 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.7/8 brd 10.255.255.255 scope global eth3

inet 169.254.155.59/18 brd 169.254.191.255 scope global eth3:1

inet6 fe80::7a2b:cbff:fe44:8d78/64 scope link

valid_lft forever preferred_lft forever

6: eth4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:10:18:9f:70:48 brd ff:ff:ff:ff:ff:ff

inet 10.0.0.8/8 brd 10.255.255.255 scope global eth4

7: eth5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:10:18:9f:70:4a brd ff:ff:ff:ff:ff:ff

inet 10.0.0.9/8 brd 10.255.255.255 scope global eth5

8: sit0: <NOARP> mtu 1480 qdisc noop

link/sit 0.0.0.0 brd 0.0.0.0

[root@promotiondbs grid]#
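In the multi-NIC output above, every HAIP carries a /18 mask, whereas the single-NIC case showed /16. This suggests that the link-local block 169.254.0.0/16 is divided into four /18 slices, one per HAIP. That reading is an inference from the output, not something the text states. The following Python sketch checks it against the first node's addresses; the names `SLICES` and `slice_of` are hypothetical.

```python
import ipaddress

# The four /18 slices of the link-local block 169.254.0.0/16.
SLICES = list(ipaddress.ip_network("169.254.0.0/16").subnets(new_prefix=18))

# HAIP addresses copied from the first node's output above.
NODE1_HAIPS = {
    "eth1:1": "169.254.8.96",
    "eth1:2": "169.254.242.150",
    "eth2:1": "169.254.111.15",
    "eth3:1": "169.254.167.173",
}


def slice_of(address):
    """Return the /18 subnet that contains the given address."""
    ip = ipaddress.ip_address(address)
    return next(net for net in SLICES if ip in net)


for label, addr in NODE1_HAIPS.items():
    print(label, addr, slice_of(addr))
```

Each of the four addresses falls into a different slice, which matches the four distinct broadcast addresses (169.254.63.255, 169.254.127.255, 169.254.191.255, 169.254.255.255) in the listing.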

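2.7.1 oifcfg Cannot Show HAIP Information

Note that although the oifcfg command shows what each NIC in the cluster is used for, it does not show HAIP information: getif lists only the interfaces we configured, and iflist reports the 169.254.0.0 subnet as UNKNOWN. For example:

[grid@indexserver1 ~]$ oifcfg getif

eth0 192.168.1.0 global public

eth1 10.0.0.0 global cluster_interconnect

[grid@indexserver1 ~]$ oifcfg iflist -p -n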

eth0 192.168.1.0 PRIVATE 255.255.255.0

eth1 10.0.0.0 PRIVATE 255.0.0.0

eth1 169.254.0.0 UNKNOWN 255.255.0.0

2.7.2 Confirming That ASM Uses HAIP

To confirm that ASM uses HAIP, look at the ASM startup log. As the grid user, change to the trace directory under $ORACLE_BASE:

[grid@indexserver1 trace]$ cd /u01/app/grid/diag/asm/+asm/+ASM4/trace

[grid@indexserver1 trace]$ more alert_+ASM4.log

Thu Jul 05 13:07:52 2012

* instance_number obtained from CSS = 4, checking for the existence of node 0...

* node 0 does not exist. instance_number = 4

Starting ORACLE instance (normal)

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.

[name='eth1:1', type=1, ip=169.254.177.128, mac=78-2b-cb-43-c5-0f, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]

Public Interface 'eth0' configured from GPnP for use as a public interface.

[name='eth0', type=1, ip=192.168.1.70, mac=78-2b-cb-43-c5-0d, net=192.168.1.0/24, mask=255.255.255.0, use=public/1]

Shared memory segment for instance monitoring created

Picked latch-free SCN scheme 3

Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/11.2.0.2/grid/dbs/arch

Autotune of undo retention is turned on.

LICENSE_MAX_USERS = 0

SYS auditing is disabled

NOTE: Volume support enabled

Starting up:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options.

Using parameter settings in server-side spfile +DATA/indexgrid/asmparameterfile/registry.253.787835657

System parameters with non-default values:

large_pool_size = 12M

instance_type = "asm"

remote_login_passwordfile = "EXCLUSIVE"

asm_diskstring = "ORCL:*"

asm_power_limit = 1

diagnostic_dest = "/u01/app/grid"

Cluster communication is configured to use the following interface(s) for this instance

169.254.177.128

cluster interconnect IPC version:Oracle UDP/IP (generic)

……

You can also confirm this through a view:

[grid@indexserver1 trace]$ export ORACLE_SID=+ASM4

[grid@indexserver1 trace]$ sqlplus " / as sysdba"

SQL*Plus: Release 11.2.0.2.0 Production on Thu Jul 5 18:22:30 2012

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

SQL> select name,ip_address from v$cluster_interconnects;

NAME            IP_ADDRESS

--------------- ----------------

eth1:1          169.254.177.128

2.7.3 Confirming That the RDBMS Database Uses HAIP

The database startup log and the same view likewise show evidence that HAIP is in use.

First, the database startup log:

[oracle@indexserver1 ~]$ cd $ORACLE_BASE/diag/rdbms/wxxrdb/wxxrdb3/trace/

[oracle@indexserver1 trace]$ more alert_wxxrdb3.log

Thu Jul 05 16:50:30 2012

Adjusting the default value of parameter parallel_max_servers

from 640 to 135 due to the value of parameter processes (150)

Starting ORACLE instance (normal)

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.

[name='eth1:1', type=1, ip=169.254.177.128, mac=78-2b-cb-43-c5-0f, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]

Public Interface 'eth0' configured from GPnP for use as a public interface.

[name='eth0', type=1, ip=192.168.1.70, mac=78-2b-cb-43-c5-0d, net=192.168.1.0/24, mask=255.255.255.0, use=public/1]

Public Interface 'eth0:1' configured from GPnP for use as a public interface.

[name='eth0:1', type=1, ip=192.168.1.80, mac=78-2b-cb-43-c5-0d, net=192.168.1.0/24, mask=255.255.255.0, use=public/1]

Public Interface 'eth0:2' configured from GPnP for use as a public interface.

[name='eth0:2', type=1, ip=192.168.1.86, mac=78-2b-cb-43-c5-0d, net=192.168.1.0/24, mask=255.255.255.0, use=public/1]

Shared memory segment for instance monitoring created

Picked latch-free SCN scheme 3

Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/11.2.0.2/database/dbs/arch

Autotune of undo retention is turned on.

IMODE=BR

ILAT =28

LICENSE_MAX_USERS = 0

SYS auditing is disabled

Starting up:

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Partitioning, Real Application Clusters, Oracle Label Security and Real Application Testing options.

Using parameter settings in client-side pfile /u01/app/oracle/cfgtoollogs/dbca/wxxrdb/initwxxrdbTempOMF.ora on machine indexserver1

System parameters with non-default values:

processes= 150

memory_target= 4800M

control_files= "/u01/app/oracle/cfgtoollogs/dbca/wxxrdb/tempControl.ctl"

db_block_size= 8192

compatible= "11.2.0.0.0"

db_create_file_dest= "+DATA"

undo_tablespace= "UNDOTBS1"

instance_number= 3

remote_login_passwordfile = "EXCLUSIVE"

db_domain= ""

dispatchers= "(PROTOCOL=TCP) (SERVICE=wxxrdbXDB)"

remote_listener= "indexgrid.wxxr.com.cn:1521"

audit_file_dest= "/u01/app/oracle/admin/wxxrdb/adump"

audit_trail= "DB"

db_name= "seeddata"

db_unique_name= "wxxrdb"

open_cursors= 300

diagnostic_dest= "/u01/app/oracle"

Cluster communication is configured to use the following interface(s) for this instance

169.254.177.128

cluster interconnect IPC version:Oracle UDP/IP (generic)

IPC Vendor 1 proto 2

Through the view:

SQL> select name,ip_address from v$cluster_interconnects;

NAME            IP_ADDRESS

--------------- ----------------

eth1:1          169.254.177.128
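As a rough sketch of how the alert-log evidence could be checked automatically, the Python below pulls the bracketed GPnP entries out of alert-log text and keeps those whose `use` field starts with `haip:`. The sample lines are copied from the log above with the line wrap joined; the regular expression and the function name are assumptions made for illustration, and the log format is only what this one transcript shows.

```python
import re

# Entries copied from the alert log above (wrapped line joined back together).
ALERT_LOG = """\
Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
[name='eth1:1', type=1, ip=169.254.177.128, mac=78-2b-cb-43-c5-0f, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'eth0' configured from GPnP for use as a public interface.
[name='eth0', type=1, ip=192.168.1.70, mac=78-2b-cb-43-c5-0d, net=192.168.1.0/24, mask=255.255.255.0, use=public/1]
"""

# Captures the interface name, its IP address, and the use= field of one entry.
ENTRY_RE = re.compile(r"\[name='([^']+)',.*?ip=([\d.]+),.*?use=([^\]]+)\]")


def haip_interfaces(log_text):
    """Return (name, ip) for every GPnP entry whose use field starts with 'haip:'."""
    return [
        (name, ip)
        for name, ip, use in ENTRY_RE.findall(log_text)
        if use.startswith("haip:")
    ]


print(haip_interfaces(ALERT_LOG))
```

The address this reports can then be compared with the row returned by v$cluster_interconnects.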

2.8 Fewer Machine Reboots: Enhanced IO Fencing

We all know that Oracle implements IO fencing by rebooting the failed node, which resolves split-brain. However, every such reboot triggers a cluster reconfiguration, during which the database is frozen (Freezing). While frozen, the database cannot accept connections or complete external requests, so clients appear to hang. For a RAC environment of any size, machine reboots should therefore be kept to a minimum. Put differently, a more graceful solution is called for.

Telling a machine that has stopped responding because of a real failure from one that is merely too heavily loaded to respond for the moment is genuinely hard. I have no way of knowing what algorithm Oracle actually uses, but Oracle 11.2 RAC handles node eviction more gently than earlier releases did.

If a node hangs and stops answering heartbeats, Oracle first tries to kill the processes that perform IO (that is, the processes that could corrupt data), such as DBWR and LGWR. If Oracle cannot kill these processes, Grid reboots the whole machine.

If Oracle does kill these processes, it shuts down Grid itself and then restarts it. This restart is controlled by ohasd, or more precisely by Grid's control file /etc/oracle/scls_scr/<hostname>/root/ohasdrun.

In short, Oracle 11.2 Grid is no longer as blunt and absolute about rebooting nodes as it used to be, which reduces the number of machine reboots to some extent.
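The decision flow described in this section can be written down as a few lines of Python. This is only a sketch of the behavior as the text describes it; Oracle's real algorithm is not known here, and the `Action` enum, the `fence` function, and the `kill_io_processes` callback are all hypothetical names.

```python
from enum import Enum


class Action(Enum):
    NONE = "node is healthy, nothing to do"
    RESTART_GRID = "IO processes killed, restart the Grid stack only"
    REBOOT_NODE = "IO processes could not be killed, reboot the machine"


def fence(heartbeat_ok, kill_io_processes):
    """Decide how to fence a node, following the flow described in the text.

    kill_io_processes is a callable returning True when every process that
    performs IO (DBWR, LGWR, ...) was terminated.
    """
    if heartbeat_ok:
        return Action.NONE
    if kill_io_processes():
        # Data can no longer be corrupted, so a Grid restart is enough.
        return Action.RESTART_GRID
    # A surviving writer could still damage data: fall back to a reboot.
    return Action.REBOOT_NODE


print(fence(False, lambda: True).value)
```

The point of the sketch is the ordering: the machine reboot is the last resort, reached only when the cheaper step fails.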

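At this point we have covered essentially all the new concepts that come up during installation, and next we will dig into the internals of Grid. My personal advice, though, is that to get a firmer grip on this chapter you should delete the Oracle Grid you installed earlier and install it again on your own. That will leave a lasting impression.

So let's see how to remove a Grid installation cleanly.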

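2.9 Uninstalling Grid

Oracle does not provide a graphical uninstall tool, though later versions may. A clean uninstall also takes more than simply deleting the Oracle directories. The Grid installation directory contains a deinstall directory, and the deinstall script in it is what uninstalls Grid. We use it when rebuilding the whole cluster environment or removing RAC altogether.

The steps for removing Grid are simple and intuitive. Below we remove a 4-node cluster in which every node has its own Grid home and Oracle Database home. The correct uninstall sequence is as follows.

2.9.1 Shutting Down the Databases and Resources

First, shut down all databases and other resources on every node of the cluster. This step must be run as root on each node: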

[root@indexserver1 ~]# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'indexserver2'

CRS-2673: Attempting to stop 'ora.crsd' on 'indexserver2'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'indexserver2'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.cvu' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.indexserver3.vip' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.registry.acfs' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'indexserver2'

CRS-2677: Stop of 'ora.indexserver3.vip' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.indexserver2.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.indexserver2.vip' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.indexserver4.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.indexserver4.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.cvu' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.scan3.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.scan2.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.scan3.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.indexserver1.vip' on 'indexserver2'

CRS-2677: Stop of 'ora.scan1.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.scan2.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.indexserver1.vip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.registry.acfs' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'indexserver2'

CRS-2677: Stop of 'ora.asm' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'indexserver2'

CRS-2677: Stop of 'ora.ons' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'indexserver2'

CRS-2677: Stop of 'ora.net1.network' on 'indexserver2' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'indexserver2' has completed

CRS-2677: Stop of 'ora.crsd' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.crf' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.ctssd' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.evmd' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.asm' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'indexserver2'

CRS-2677: Stop of 'ora.asm' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'indexserver2'

CRS-2677: Stop of 'ora.crf' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.mdnsd' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.evmd' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.drivers.acfs' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'indexserver2'

CRS-2677: Stop of 'ora.cssd' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'indexserver2'

CRS-2673: Attempting to stop 'ora.diskmon' on 'indexserver2'

CRS-2677: Stop of 'ora.gipcd' on 'indexserver2' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'indexserver2'

CRS-2677: Stop of 'ora.diskmon' on 'indexserver2' succeeded

CRS-2677: Stop of 'ora.gpnpd' on 'indexserver2' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'indexserver2' has completed

CRS-4133: Oracle High Availability Services has been stopped.

Run this command on every node, then move on to deinstall.
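Because the shutdown output is long, it is easy to miss a resource that was asked to stop but never reported success. The following Python sketch pairs each CRS-2673 "Attempting to stop" message with a CRS-2677 "succeeded" message and lists whatever is left over. It relies only on the two message formats visible in the transcript above; the function names are hypothetical, and the sample text is deliberately truncated to show a leftover.

```python
import re
from collections import Counter

ATTEMPT_RE = re.compile(r"CRS-2673: Attempting to stop '([^']+)' on '([^']+)'")
STOPPED_RE = re.compile(r"CRS-2677: Stop of '([^']+)' on '([^']+)' succeeded")


def unfinished_stops(output):
    """Return (resource, node) pairs attempted more often than they succeeded."""
    attempts = Counter(ATTEMPT_RE.findall(output))
    stopped = Counter(STOPPED_RE.findall(output))
    # Counter subtraction keeps only entries with a positive remainder.
    return sorted(attempts - stopped)


def stack_stopped(output):
    """True when the final CRS-4133 confirmation is present."""
    return "CRS-4133" in output


# Truncated on purpose: the real transcript does report ora.cssd as stopped.
SAMPLE = """\
CRS-2673: Attempting to stop 'ora.crsd' on 'indexserver2'
CRS-2677: Stop of 'ora.crsd' on 'indexserver2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'indexserver2'
"""

print(unfinished_stops(SAMPLE), stack_stopped(SAMPLE))
```

Counting rather than using a plain set matters here, because some resources (ora.asm in the transcript) are stopped twice during one shutdown.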

2.9.2 Uninstalling with deinstall

Next, on the first node where Grid was installed, run the deinstall script as the grid user. The script runs a series of checks and asks you to confirm a series of questions before the actual uninstall begins.

[grid@indexserver1 ~]$ cd /u01/app/11.2.0/grid/deinstall/

[grid@indexserver1 deinstall]$ ./deinstall

Checking for required files and bootstrapping ...

Please wait ...

Location of logs /u01/app/oraInventory/logs/

############ ORACLE DEINSTALL & DECONFIG TOOL START ############

######################### CHECK OPERATION START #########################

Install check configuration START

Checking for existence of the Oracle home location /u01/app/11.2.0.2/grid

Oracle Home type selected for de-install is: CRS

Oracle Base selected for de-install is: /u01/app/grid

Checking for existence of central inventory location /u01/app/oraInventory

Checking for existence of the Oracle Grid Infrastructure home /u01/app/11.2.0.2/grid

The following nodes are part of this cluster: indexserver1,indexserver2,indexserver3,indexserver4

Install check configuration END

Skipping Windows and .NET products configuration check

Checking Windows and .NET products configuration END

Traces log file: /u01/app/oraInventory/logs//crsdc.log

Enter an address or the name of the virtual IP used on node "indexserver1"[null]

indexserver1-vip

The following information can be collected by running "/sbin/ifconfig -a" on node "indexserver1"

Continue by entering the VIP names of the other nodes.

Enter the IP netmask of Virtual IP "192.168.123.210" on node "indexserver1"[255.255.255.0]

Enter the network interface name on which the virtual IP address "192.168.123.210" is active

eth0

Enter an address or the name of the virtual IP used on node "indexserver2"[192.168.123.210]

indexserver2-vip

The following information can be collected by running "/sbin/ifconfig -a" on node "indexserver2"

Enter the IP netmask of Virtual IP "192.168.123.211" on node "indexserver2"[255.255.255.0]

Enter the network interface name on which the virtual IP address "192.168.123.211" is active[eth0]

Enter an address or the name of the virtual IP used on node "indexserver3"[192.168.123.211]

indexserver3-vip

The following information can be collected by running "/sbin/ifconfig -a" on node "indexserver3"

Enter the IP netmask of Virtual IP "192.168.123.212" on node "indexserver3"[255.255.255.0]

Enter the network interface name on which the virtual IP address "192.168.123.212" is active[eth0]

Enter an address or the name of the virtual IP used on node "indexserver4"[192.168.123.212]

indexserver4-vip

The following information can be collected by running "/sbin/ifconfig -a" on node "indexserver4"

Enter the IP netmask of Virtual IP "192.168.123.213" on node "indexserver4"[255.255.255.0]

Enter the network interface name on which the virtual IP address "192.168.123.213" is active[eth0]

Enter an address or the name of the virtual IP[]

Network Configuration check config START

Network de-configuration trace file location: /u01/app/oraInventory/logs/netdc_check2012-06-20_05-50-01-PM.log

Specify all RAC listeners (do not include SCAN listener) that are to be de-configured[LISTENER,LISTENER_SCAN3,LISTENER_SCAN2,LISTENER_SCAN1]:

Network Configuration check config END

Asm Check Configuration START

ASM de-configuration trace file location: /u01/app/oraInventory/logs/asmcadc_check2012-06-20_05-50-05-PM.log

ASM configuration was not detected in this Oracle home. Was ASM configured in this Oracle home (y|n) [n]: y

Specify the ASM Diagnostic Destination [ ]:

Specify the diskstring []: ORCL:*

Specify the diskgroups that are managed by this ASM instance []: DATA

De-configuring ASM will drop the diskgroups at cleanup time. Do you want deconfig tool to drop the diskgroups y|n [y]:

######################### CHECK OPERATION END #########################

####################### CHECK OPERATION SUMMARY #######################

Oracle Grid Infrastructure Home is: /u01/app/11.2.0.2/grid

The cluster node(s) on which the Oracle home de-installation will be performed are: indexserver1,indexserver2,indexserver3,indexserver4

Oracle Home selected for de-install is: /u01/app/11.2.0.2/grid

Inventory Location where the Oracle home registered is: /u01/app/oraInventory

Skipping Windows and .NET products configuration check

Following RAC listener(s) will be de-configured: LISTENER,LISTENER_SCAN3,LISTENER_SCAN2,LISTENER_SCAN1

ASM instance will be de-configured from this Oracle home

Do you want to continue (y - yes, n - no)? [n]: y

A log of this session will be written to: '/u01/app/oraInventory/logs/deinstall_deconfig2012-06-20_05-48-53-PM.out'

Any error messages from this session will be written to: '/u01/app/oraInventory/logs/deinstall_deconfig2012-06-20_05-48-53-PM.err'

######################## CLEAN OPERATION START ########################

ASM de-configuration trace file location: /u01/app/oraInventory/logs/asmcadc_clean2012-06-20_05-50-53-PM.log

ASM Clean Configuration START

ASM Clean Configuration END

Network Configuration clean config START

Network de-configuration trace file location: /u01/app/oraInventory/logs/netdc_clean2012-06-20_05-50-55-PM.log

De-configuring RAC listener(s): LISTENER,LISTENER_SCAN3,LISTENER_SCAN2,LISTENER_SCAN1

De-configuring listener: LISTENER

Stopping listener: LISTENER

Listener stopped successfully.

Listener de-configured successfully.

De-configuring listener: LISTENER_SCAN3

Stopping listener: LISTENER_SCAN3

Listener stopped successfully.

Listener de-configured successfully.

De-configuring listener: LISTENER_SCAN2

Stopping listener: LISTENER_SCAN2

Listener stopped successfully.

Listener de-configured successfully.

De-configuring listener: LISTENER_SCAN1

Stopping listener: LISTENER_SCAN1

Listener stopped successfully.

Listener de-configured successfully.

De-configuring Naming Methods configuration file on all nodes...

Naming Methods configuration file de-configured successfully.

De-configuring Local Net Service Names configuration file on all nodes...

Local Net Service Names configuration file de-configured successfully.

De-configuring Directory Usage configuration file on all nodes...

Directory Usage configuration file de-configured successfully.

De-configuring backup files on all nodes...

Backup files de-configured successfully.

The network configuration has been cleaned up successfully.

Network Configuration clean config END

---------------------------------------->

The deconfig command below can be executed in parallel on all the remote nodes. Execute the command on the local node after the execution completes on all the remote nodes.

Run the following command as the root user or the administrator on node "indexserver3".

/tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

Run the following command as the root user or the administrator on node "indexserver2".

/tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

Run the following command as the root user or the administrator on node "indexserver4".

/tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

Run the following command as the root user or the administrator on node "indexserver1".

/tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp" -lastnode

Press Enter after you finish running the above commands

<----------------------------------------

The prompt means: run these commands on each node, then come back to this screen and press Enter to continue. The commands are identical except that the one for the last node carries an extra -lastnode parameter.
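The ordering rule in the prompt (remote nodes first, the local node last with -lastnode) can be captured in a small Python sketch that only builds the command strings and never runs them. The paths are the ones printed by deinstall in this session; the function name `deconfig_commands` and its parameters are hypothetical.

```python
def deconfig_commands(tmp_dir, rsp_file, remote_nodes, local_node):
    """Return (node, command) pairs in the order the deinstall prompt asks for."""
    base = (
        f"{tmp_dir}/perl/bin/perl -I{tmp_dir}/perl/lib -I{tmp_dir}/crs/install "
        f"{tmp_dir}/crs/install/rootcrs.pl -force -deconfig "
        f'-paramfile "{rsp_file}"'
    )
    # Remote nodes may run in parallel; none of them gets -lastnode.
    plan = [(node, base) for node in remote_nodes]
    # The local node goes last and is the only one flagged as the last node.
    plan.append((local_node, base + " -lastnode"))
    return plan


TMP_DIR = "/tmp/deinstall2012-06-20_05-48-48PM"
RSP_FILE = TMP_DIR + "/response/deinstall_Ora11g_gridinfrahome1.rsp"

PLAN = deconfig_commands(
    TMP_DIR, RSP_FILE,
    remote_nodes=["indexserver3", "indexserver2", "indexserver4"],
    local_node="indexserver1",
)
for node, command in PLAN:
    print(node, "->", command)
```

Generating the strings from one template also avoids the copy-and-paste damage (stray spaces, a dash turned into a typographic dash) that long commands tend to pick up.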

Now run these commands as root on each node. Their output is long, so it is not reproduced here. On the first three nodes, run:

[root@indexserver3 ~]# /tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

The last node adds -lastnode; everything else is the same:

[root@indexserver1 ~]# /tmp/deinstall2012-06-20_05-48-48PM/perl/bin/perl -I/tmp/deinstall2012-06-20_05-48-48PM/perl/lib -I/tmp/deinstall2012-06-20_05-48-48PM/crs/install /tmp/deinstall2012-06-20_05-48-48PM/crs/install/rootcrs.pl -force -deconfig -paramfile "/tmp/deinstall2012-06-20_05-48-48PM/response/deinstall_Ora11g_gridinfrahome1.rsp" -lastnode

Once all 4 nodes have finished, return to the window where deinstall is running and press Enter to continue the uninstall. This time the directories are deleted:

Removing Windows and .NET products configuration END

Oracle Universal Installer clean START

Detach Oracle home '/u01/app/11.2.0.2/grid' from the central inventory on the local node :Done

Delete directory '/u01/app/11.2.0.2/grid' on the local node : Done

Delete directory '/u01/app/grid' on the local node : Done

Detach Oracle home '/u01/app/11.2.0.2/grid' from the central inventory on the remote nodes 'indexserver2,indexserver3,indexserver4' : Done

Delete directory '/u01/app/11.2.0.2/grid' on the remote nodes 'indexserver2,indexserver3,indexserver4' : Done

Delete directory '/u01/app/grid' on the remote nodes 'indexserver2' : Done

Delete directory '/u01/app/grid' on the remote nodes 'indexserver3' : Done

Delete directory '/u01/app/grid' on the remote nodes 'indexserver4' : Done

Oracle Universal Installer cleanup was successful.

Oracle Universal Installer clean END

Oracle install clean START

Clean install operation removing temporary directory '/tmp/deinstall2012-06-20_05-48-48PM' on node 'indexserver1'

Clean install operation removing temporary directory '/tmp/deinstall2012-06-20_05-48-48PM' on node 'indexserver2'

Clean install operation removing temporary directory '/tmp/deinstall2012-06-20_05-48-48PM' on node 'indexserver3'

Clean install operation removing temporary directory '/tmp/deinstall2012-06-20_05-48-48PM' on node 'indexserver4'

Oracle install clean END

######################### CLEAN OPERATION END #########################

####################### CLEAN OPERATION SUMMARY #######################

ASM instance was de-configured successfully from the Oracle home

Following RAC listener(s) were de-configured successfully: LISTENER,LISTENER_SCAN3,LISTENER_SCAN2,LISTENER_SCAN1

Oracle Clusterware is stopped and successfully de-configured on node "indexserver3"

Oracle Clusterware is stopped and successfully de-configured on node "indexserver2"

Oracle Clusterware is stopped and successfully de-configured on node "indexserver1"

Oracle Clusterware is stopped and successfully de-configured on node "indexserver4"

Oracle Clusterware is stopped and de-configured successfully.

Skipping Windows and .NET products configuration clean

Successfully detached Oracle home '/u01/app/11.2.0.2/grid' from the central inventory on the local node.

Successfully deleted directory '/u01/app/11.2.0.2/grid' on the local node.

Successfully deleted directory '/u01/app/grid' on the local node.

Successfully detached Oracle home '/u01/app/11.2.0.2/grid' from the central inventory on the remote nodes 'indexserver2,indexserver3,indexserver4'.

Successfully deleted directory '/u01/app/11.2.0.2/grid' on the remote nodes 'indexserver2,indexserver3,indexserver4'.

Successfully deleted directory '/u01/app/grid' on the remote nodes 'indexserver2'.

Successfully deleted directory '/u01/app/grid' on the remote nodes 'indexserver3'.

Successfully deleted directory '/u01/app/grid' on the remote nodes 'indexserver4'.

Oracle Universal Installer cleanup was successful.

Oracle deinstall tool successfully cleaned up temporary directories.

#######################################################################

############# ORACLE DEINSTALL & DECONFIG TOOL END #############

2.9.3 Post-Uninstall Checks

Once the deinstall script finishes, the Grid uninstall is complete. Next, run a few checks to make sure nothing went wrong. We check the following.

(1) First, review the log output that these commands printed on the 4 nodes and make sure it contains no significant errors.

(2) Next, check /etc/inittab on every node. The ohasd entry should have been removed, so there should be nothing like the following:

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null

(3) No ora or d.bin processes should be running on any node. If any are, kill them with kill -9.

[root@indexserver3 ~]# ps -edf|grep ora

root 21238 17607 0 10:25 pts/0 00:00:00 grep ora

[root@indexserver3 ~]# ps -edf|grep d.bin

root 21240 17607 0 10:25 pts/0 00:00:00 grep d.bin

(4) Look at the /etc/oracle directory. The .loc files in it have been renamed to .orig.

[root@indexserver3 ~]# cd /etc/oracle/

[root@indexserver3 oracle]# ls

ocr.loc.orig olr.loc.orig
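Checks (2) to (4) above are mechanical enough to script. The Python sketch below takes the text of /etc/inittab, the output of `ps -edf`, and the file names in /etc/oracle, and returns a list of problems. It is a minimal illustration under stated assumptions: the function name is hypothetical, the process check mirrors the `grep ora` and `grep d.bin` filters by plain substring matching, and how the three inputs are collected from each node is left out.

```python
def post_uninstall_problems(inittab_text, ps_output, etc_oracle_files):
    """Return human-readable problems found by checks (2), (3) and (4)."""
    problems = []

    # (2) No active inittab entry may still start init.ohasd.
    for line in inittab_text.splitlines():
        if "init.ohasd" in line and not line.lstrip().startswith("#"):
            problems.append("inittab still starts ohasd: " + line.strip())

    # (3) No ora or d.bin process may remain; the grep itself does not count.
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8:
            continue
        command = fields[7]
        if command.startswith("grep "):
            continue
        if "ora" in line or "d.bin" in line:
            problems.append("process still running: " + command)

    # (4) Every .loc file should have been renamed to .orig.
    for name in etc_oracle_files:
        if name.endswith(".loc"):
            problems.append("not renamed to .orig: " + name)

    return problems


# Inputs taken from the transcript above: a clean node reports no problems.
CLEAN = post_uninstall_problems(
    "",
    "root 21238 17607 0 10:25 pts/0 00:00:00 grep ora",
    ["ocr.loc.orig", "olr.loc.orig"],
)
print(CLEAN)
```

An empty list on every node means the manual checks would have passed as well.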

2.9.4 Deleting the Directories

If all the checks above pass, you can now delete everything under $GRID_HOME, although that directory should already be empty. You can go on to empty the $ORACLE_HOME directory as well.

2.9.5 Deleting the ASM Disks

The ASM disks are still there, so they should be deleted too in order to start from a completely fresh, clean installation. Run the following commands as root on the 4 nodes:

[root@indexserver1 ~]# ls /dev/oracleasm/disks/

WXXRINDEX1

[root@indexserver1 ~]# oracleasm listdisks

WXXRINDEX1

[root@indexserver1 ~]# oracleasm deletedisk wxxrindex1

Clearing disk header: done

Dropping disk: done

[root@indexserver1 ~]# oracleasm listdisks

[root@indexserver1 ~]#

Finally, finish with a dd:

dd if=/dev/zero of=/dev/emcpowere1 bs=1024 count=1000000
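For a sense of how much of the device this dd command overwrites, the arithmetic is simply block size times block count. The short Python sketch below works it out; the helper name `dd_bytes` is hypothetical, and the calculation assumes the device is at least that large.

```python
def dd_bytes(bs, count):
    """Bytes written by dd for a given block size and block count."""
    return bs * count


TOTAL = dd_bytes(1024, 1_000_000)
print(TOTAL)           # total bytes written
print(TOTAL / 2**20)   # the same amount expressed in MiB
```

With bs=1024 and count=1000000 the command zeroes 1,024,000,000 bytes, a little under 1 GiB, starting at the beginning of the device. That covers the disk header that ASM would otherwise recognize.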

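That completes the Grid uninstall. Now go ahead and deploy a fresh Grid.

2.10 Summary

This chapter continues the previous one; the two were originally a single chapter. But an 80-page installation manual in one chapter would clearly overwhelm readers. In product terms, the "user experience is poor". So I pulled out the peripheral material and made it a chapter of its own.

This chapter revolves around the new screens and new terms encountered during installation. The Oracle 11gR2 installer interface has changed a great deal, most of all in its color scheme. It is not very pretty, but that does not shake its standing in my mind as "a great product".

So in this chapter we covered the management idea of "role separation", which explains why Oracle 11gR2 has so many users and groups. We then discussed SSH user equivalence, NTP time synchronization, and HAIP multi-NIC bonding. These are all integration measures Oracle took to "lower the difficulty of deployment", very much in line with Oracle's "cloud out of the box" slogan. Personally, though, I do not think these "simplifications" are good for new DBAs, because they miss many chances to grow. Perhaps that is the price of progress.

The installation is not finished yet. One last step remains: a thorough check, to make sure our earlier effort was not wasted and the system can be put into production with confidence. That is the subject of the next chapter.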
