Instance node failing to see ASM and failing to start
Hi,
I'm trying to troubleshoot why the RPPROD1 instance's connection to ASM began failing last Tuesday, and why the instance then aborted on Wednesday (see the alert log snippet below).
I'd like additional troubleshooting suggestions, ideally a way to test ASM connectivity for the instance and determine where the failure point is.
Alert log excerpts (lines starting with asterisks are my comments):
Tue Aug 16 02:00:00 2011
Closing scheduler window
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter
Tue Aug 16 02:00:35 2011
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_ora_13026.trc:
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
NOTE: deferred map free for map id 21812
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_ora_13026.trc:
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
NOTE: deferred map free for map id 21812
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_ora_13026.trc:
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
NOTE: deferred map free for map id 21813
**** and then about a billion of these identical messages flood the log....
Tue Aug 16 02:01:28 2011
WARNING: ASM communication error: op 18 state 0x40 (1034)
ERROR: slave communication error with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_pmon_16030.trc:
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
**** ... and this continues until Wednesday night at about 10:20 pm
Wed Aug 17 22:21:45 2011
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_dbw0_16168.trc:
ORA-01148: cannot refresh file size for datafile 6
ORA-01110: data file 6: '+DATA01/rpprod/datafile/undotbs2.265.753364203'
ORA-15055: unable to connect to ASM instance
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
Errors in file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_dbw0_16168.trc:
ORA-63997: file size refresh failed
DBW0 (ospid: 16168): terminating the instance due to error 63997
Wed Aug 17 22:21:45 2011
System state dump requested by (instance=1, osid=16168 (DBW0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/rpprod/RPPROD1/trace/RPPROD1_diag_16115.trc
Wed Aug 17 22:21:46 2011
ORA-1092 : opitsk aborting process
Instance terminated by DBW0, pid = 16168
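Since the alert log is flooded with repeats, a per-error count may be easier to read than the raw file. The following is only a sketch: the function name `summarize_ora_errors` is hypothetical, and the log file passed to it is whatever alert log you want to summarize.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how often each ORA-nnnnn code appears in a log file.
# Prints "CODE COUNT" lines, most frequent first (ties ordered by code).
summarize_ora_errors() {
    local logfile="$1"
    grep -oE 'ORA-[0-9]+' "$logfile" |   # pull out every ORA code
        sort | uniq -c |                 # count occurrences of each code
        awk '{ print $2, $1 }' |         # normalise to "CODE COUNT"
        sort -k2,2nr -k1,1               # highest count first, then by code
}
```

Usage would look like `summarize_ora_errors alert_RPPROD1.log`, assuming the alert log uses the usual `alert_<SID>.log` name in the same trace directory as the trace files listed above.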
**** PS. rdbms patch levels look fine
oracle@hostname1.domain:[RPPROD]:/home/oracle >
$ opatch lsinventory
Invoking OPatch 11.2.0.1.1
Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation. All rights reserved.
Oracle Home : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
from : /etc/oraInst.loc
OPatch version : 11.2.0.1.1
OUI version : 11.2.0.2.0
OUI location : /u01/app/oracle/product/11.2.0/db_1/oui
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2011-08-18_13-58-55PM.log
Patch history file: /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch_history.txt
Lsinventory Output file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/lsinv/lsinventory2011-08-18_13-58-55PM.txt
--------------------------------------------------------------------------------
Installed Top-level Products (1):
Oracle Database 11g 11.2.0.2.0
There are 1 products installed in this Oracle Home.
There are no Interim patches installed in this Oracle Home.
Rac system comprising of multiple nodes
Local node = hostname1
Remote node = hostname2
--------------------------------------------------------------------------------
OPatch succeeded.
oracle@hostname1.domain:[RPPROD]:/home/oracle >
$
**** Here is the error I get when I try to start the instance with srvctl
NB1. The RPPROD2 alert log appears to be fine (and that node is still up and running).
NB2. The +ASM1 alert log simply states that the RPPROD1 client exited/disconnected.
NB3. I could not see any strange messages pertaining to ASM in the crsd log when I grepped it.
oracle@hostname1.domain:[RPPROD]:/home/oracle >
$ srvctl start instance -d RPPROD -i RPPROD1
PRCR-1013 : Failed to start resource ora.rpprod.db
PRCR-1064 : Failed to start resource ora.rpprod.db on node hostname1
CRS-5017: The resource action "ora.rpprod.db start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA01/RPPROD/spfileRPPROD.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA01/RPPROD/spfileRPPROD.ora
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 4653061
Additional information: 10
CRS-2674: Start of 'ora.rpprod.db' on 'hostname1' failed
oracle@hostname1.domain:[RPPROD]:/home/oracle >
$
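Because every failure above ends in "Linux-x86_64 Error: 13: Permission denied" while attaching to a shared memory segment, one check that may be worth making is whether the `oracle` executable still carries its setuid and setgid bits on hostname1, and how that compares with hostname2, where RPPROD2 is still running. That lost permission bits are the cause is an assumption, not something the logs confirm. The helper name `check_setid_bits` is hypothetical; the sketch only uses the standard `-u` and `-g` shell file tests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a file has both setuid and setgid bits set.
# Returns 0 if both are set, 1 if either is missing, 2 if the file does not exist.
check_setid_bits() {
    local bin="$1"
    if [ ! -e "$bin" ]; then
        echo "missing: $bin"
        return 2
    fi
    if [ -u "$bin" ] && [ -g "$bin" ]; then
        echo "ok: $bin has setuid and setgid bits"
    else
        echo "suspect: $bin is missing setuid and/or setgid"
        return 1
    fi
}
```

It could be run on both nodes as `check_setid_bits "$ORACLE_HOME/bin/oracle"`, once with the database environment set and once with the grid environment set, and the results compared. `ls -l` on the same file shows the owner and group as well, and `ipcs -m` lists the shared memory segments with their owners if the segment side needs checking.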
*** In the output below, ora.rpprod.db instance 1 (the one on hostname1) has TARGET ONLINE but STATE OFFLINE
$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHIVE.dg
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.DATA01.dg
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.DATA02.dg
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.LISTENER.lsnr
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.OCR_VOTE.dg
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.asm
               ONLINE  ONLINE       hostname1                Started
               ONLINE  ONLINE       hostname2                Started
ora.gsd
               OFFLINE OFFLINE      hostname1
               OFFLINE OFFLINE      hostname2
ora.net1.network
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
ora.ons
               ONLINE  ONLINE       hostname1
               ONLINE  ONLINE       hostname2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       hostname2
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       hostname1
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       hostname1
ora.hostname1.vip
      1        ONLINE  ONLINE       hostname1
ora.hostname2.vip
      1        ONLINE  ONLINE       hostname2
ora.cvu
      1        ONLINE  ONLINE       hostname1
ora.oc4j
      1        ONLINE  ONLINE       hostname1
ora.rpprod.db
      1        ONLINE  OFFLINE
      2        ONLINE  ONLINE       hostname2                Open
ora.scan1.vip
      1        ONLINE  ONLINE       hostname2
ora.scan2.vip
      1        ONLINE  ONLINE       hostname1
ora.scan3.vip
      1        ONLINE  ONLINE       hostname1
grid@hostname1.domain:[+ASM1]:/home/grid >
$
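To pick the mismatched resources out of a long `crsctl stat res -t` listing without reading every row, a small awk filter along these lines might help. It is a sketch that assumes the layout shown above: a resource name line starting with `ora.`, followed by rows that optionally begin with an instance number. The function name `find_state_mismatches` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list resources whose TARGET differs from STATE.
# Reads crsctl-style tabular output from the named files or from stdin.
# Prints: resource name, instance number (or "-" for local resources), TARGET, STATE.
find_state_mismatches() {
    awk '
        /^ora\./ { name = $1; next }              # remember the current resource name
        {
            # Cluster resource rows start with an instance number; local rows do not.
            if ($1 ~ /^[0-9]+$/) { inst = $1; i = 2 } else { inst = "-"; i = 1 }
            target = $i
            state  = $(i + 1)
            if ((target == "ONLINE" || target == "OFFLINE") &&
                state ~ /^(ONLINE|OFFLINE|INTERMEDIATE|UNKNOWN)$/ &&
                target != state)
                print name, inst, target, state
        }
    ' "$@"
}
```

It would be used as `crsctl stat res -t | find_state_mismatches`; against the listing above it should single out instance 1 of ora.rpprod.db.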
R's V