Error while checking the status of Oracle Clusterware
Hi
I was trying to create a database using dbca after setting up the grid and database software on a Linux x86-64 RHEL 5.7 machine. The database software version is 11.2.0.3. dbca throws an error about connectivity to the clusterware, so I checked the status of the clusterware.
-bash-3.2$ ./crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
-bash-3.2$
But when I ran the same command with -init, I got the output below:
-bash-3.2$ ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       sfv9699                  Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       sfv9699
ora.crf
      1        ONLINE  ONLINE       sfv9699
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       sfv9699
ora.cssdmonitor
      1        ONLINE  ONLINE       sfv9699
ora.ctssd
      1        ONLINE  ONLINE       sfv9699                  OBSERVER
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       sfv9699
ora.evmd
      1        ONLINE  INTERMEDIATE sfv9699
ora.gipcd
      1        ONLINE  ONLINE       sfv9699
ora.gpnpd
      1        ONLINE  ONLINE       sfv9699
ora.mdnsd
      1        ONLINE  ONLINE       sfv9699
So I saw that crsd has a problem: ora.crsd has TARGET ONLINE but STATE OFFLINE. I checked the alert log and the crsd log. The output is below.
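For anyone who wants to pick out the problem rows from a long listing, here is a minimal Python sketch that flags resources whose STATE differs from their TARGET. It only assumes the two-line layout shown above (a line starting with "ora." followed by an instance row). The function name resources_off_target is made up for illustration and is not part of any Oracle tool.

```python
# Sample copied from the `crsctl stat res -t -init` output in this post
# (abbreviated to four resources).
SAMPLE = """\
ora.asm
1 ONLINE ONLINE sfv9699 Started
ora.crsd
1 ONLINE OFFLINE
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE INTERMEDIATE sfv9699
"""


def resources_off_target(text):
    """Return (name, target, state) for rows whose STATE differs from TARGET."""
    mismatched = []
    name = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("ora."):
            # A resource name line; its instance row follows.
            name = fields[0]
        elif name and fields[0].isdigit() and len(fields) >= 3:
            # Instance row: <instance number> <TARGET> <STATE> [SERVER] [DETAILS]
            target, state = fields[1], fields[2]
            if target != state:
                mismatched.append((name, target, state))
    return mismatched


for row in resources_off_target(SAMPLE):
    print(row)
```

On the sample this reports ora.crsd (ONLINE vs OFFLINE) and ora.evmd (ONLINE vs INTERMEDIATE). ora.diskmon is skipped because its target is OFFLINE as well.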
Alert <server_name>.log
----------------------------------
2012-10-20 15:37:51.408
[ohasd(3694)]CRS-2765:Resource 'ora.crsd' has failed on server 'sfv9699'.
2012-10-20 15:37:52.968
[crsd(5188)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /oracle2/app/11.2.0/grid/log/sfv9699/crsd/crsd.log.
2012-10-20 15:37:52.984
[crsd(5188)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)
]. Details at (:CRSD00111:) in /oracle2/app/11.2.0/grid/log/sfv9699/crsd/crsd.log.
2012-10-20 15:37:53.471
[ohasd(3694)]CRS-2765:Resource 'ora.crsd' has failed on server 'sfv9699'.
2012-10-20 15:37:53.472
[ohasd(3694)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.
CRSD.log
------
2012-10-20 15:37:52.456: [ CRSMAIN][3563381328] Checking the OCR device
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Sync-up with OCR
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Connecting to the CSS Daemon
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Getting local node number
2012-10-20 15:37:52.459: [ CRSMAIN][3563381328] Initializing OCR
[ CLWAL][3563381328]clsw_Initialize: OLR initlevel [70000]
2012-10-20 15:37:52.897: [ OCRASM][3563381328]proprasmo: Error in open/create file in dg [DATA]
[ OCRASM][3563381328]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge
2012-10-20 15:37:52.898: [ OCRASM][3563381328]ASM Error Stack : ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)
2012-10-20 15:37:52.967: [ OCRASM][3563381328]proprasmo: kgfoCheckMount returned [7]
2012-10-20 15:37:52.967: [ OCRASM][3563381328]proprasmo: The ASM instance is down
2012-10-20 15:37:52.968: [ OCRRAW][3563381328]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2012-10-20 15:37:52.968: [ OCRRAW][3563381328]proprioo: No OCR/OLR devices are usable
2012-10-20 15:37:52.968: [ OCRASM][3563381328]proprasmcl: asmhandle is NULL
2012-10-20 15:37:52.969: [ GIPC][3563381328] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 690], original from [clsss.c : 5326]
2012-10-20 15:37:52.975: [ default][3563381328]clsvactversion:4: Retrieving Active Version from local storage.
2012-10-20 15:37:52.978: [ CSSCLNT][3563381328]clssgsgrppubdata: group (ocr_SFV9699-cluster) not found
2012-10-20 15:37:52.978: [ OCRRAW][3563381328]proprio_repairconf: Failed to retrieve the group public data. CSS ret code [20]
2012-10-20 15:37:52.981: [ OCRRAW][3563381328]proprioo: Failed to auto repair the OCR configuration.
2012-10-20 15:37:52.981: [ OCRRAW][3563381328]proprinit: Could not open raw device
2012-10-20 15:37:52.981: [ OCRASM][3563381328]proprasmcl: asmhandle is NULL
2012-10-20 15:37:52.983: [ OCRAPI][3563381328]a_init:16!: Backend init unsuccessful : [26]
2012-10-20 15:37:52.984: [ CRSOCR][3563381328] OCR context init failure. Error: PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)
2012-10-20 15:37:52.984: [ CRSMAIN][3563381328] Created alert : (:CRSD00111:) : Could not init OCR, error: PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)
2012-10-20 15:37:52.984: [ CRSD][3563381328][PANIC] CRSD exiting: Could not init OCR, code: 26
2012-10-20 15:37:52.984: [ CRSD][3563381328] Done.
=======================
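The ORA-27303 line that repeats in both logs reports two different effective group IDs (startup egid and current egid). As a small aid for reading such logs, the Python sketch below extracts the two values from that line. The regular expression is written against the exact wording in this log, and the name egid_mismatch is hypothetical. It only reads the message and says nothing about what the correct group should be.

```python
import re

# The ORA-27303 line as it appears in the crsd.log excerpt in this post.
LINE = ("ORA-27303: additional information: "
        "startup egid = 1000 (oinstall), current egid = 10002 (dba)")

# Pattern assumed from the wording of this one log line.
EGID_RE = re.compile(
    r"startup egid = (\d+) \((\w+)\), current egid = (\d+) \((\w+)\)"
)


def egid_mismatch(line):
    """Return ((startup_gid, name), (current_gid, name)).

    Returns None if the line has no egid pair or if both sides are equal.
    """
    m = EGID_RE.search(line)
    if m is None:
        return None
    startup = (int(m.group(1)), m.group(2))
    current = (int(m.group(3)), m.group(4))
    return None if startup == current else (startup, current)


print(egid_mismatch(LINE))
```

For the line above it returns the pair (1000, oinstall) and (10002, dba), i.e. the two groups the error message is comparing.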
The log above says that the ASM instance is down and that it failed to open +DATA.
But the ASM instance is up and running:
SQL> select instance_name,status from v$instance;
INSTANCE_NAME STATUS
---------------- ------------
+ASM1 STARTED
And we haven't created any disk named DATA before this installation. We have created only the two disks below:
SQL> select name,header_status from v$asm_disk;
NAME HEADER_STATUS
------------------------------ --------------------------
ASM_DATA MEMBER
FLASH_RECOVERY MEMBER
But I am seeing a disk group in v$asm_diskgroup which we haven't created:
SQL> select name,state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA MOUNTED
Yes, this is a second installation. In the first installation we created the ASM disk as DATA. Later everything (the raw device) was formatted, these new disks were created, and the installation was started again.
[root@SFV9699 bin]# oracleasm listdisks
ASM_DATA
FLASH_RECOVERY
It seems like it is trying to read the old disk DATA.
We have also rescanned the disks with oracleasm scandisks, but it made no difference.
Where can I remove the old entry for the DATA disk?
A quick response would be much appreciated.
Thanks
SHIYAS M