RAC nodes do not start after server reboot

727876, May 20 2011 (edited Oct 11 2013)
Hi everyone,

This morning all the switches in our server room rebooted, causing all the RAC servers to restart.
After this, none of them would start successfully.

Oracle 11.2.0.1 on RHEL6

Here are some log excerpts:
--------
crsd.log
--------
2011-05-20 12:13:10.782: [ CSSCLNT][2146903840]clssscConnect: gipc request failed with 29 (0x16)
2011-05-20 12:13:10.782: [ CSSCLNT][2146903840]clsssInitNative: connect failed, rc 29
2011-05-20 12:13:10.783: [  CRSRTI][2146903840] CSS is not ready. Received status 3 from CSS. Waiting for good status ..


----------------
alertstgrac1.log
----------------
[ohasd(2303)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'stgrac1'.
2011-05-20 12:11:15.452
[cssd(2661)]CRS-1713:CSSD daemon is started in clustered mode
2011-05-20 12:11:15.739
[cssd(2661)]CRS-1603:CSSD on node stgrac1 shutdown by user.
2011-05-20 12:12:14.033
[/u01/app/11.2.0/grid/bin/orarootagent.bin(2563)]CRS-5818:Aborted command 'start for resource: ora.diskmon 1 1' for resource 'ora.diskmon'. Details at (:CRSAGF00113:) in /u01/app/11.2.0/grid/log/stgrac1/agent/ohasd/orarootagent_root/orarootagent_root.log.
2011-05-20 12:12:18.039
[ohasd(2303)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.diskmon'. Details at (:CRSPE00111:) in /u01/app/11.2.0/grid/log/stgrac1/ohasd/ohasd.log


---------------------
orarootagent_root.log
---------------------
2011-05-20 12:12:23.162: [ora.diskmon][2684352256] [clean] execCmd ret = 0
2011-05-20 12:12:23.162: [ora.diskmon][2684352256] [clean] DiskmonAgent::clean } nopipe
2011-05-20 12:12:23.163: [ora.diskmon][2684352256] [clean] clsn_agent::clean }
2011-05-20 12:12:23.163: [    AGFW][2684352256] Command: clean for resource: ora.diskmon 1 1 completed with status: SUCCESS
2011-05-20 12:12:23.163: [    AGFW][2684352256] Executing command: check for resource: ora.diskmon 1 1
2011-05-20 12:12:23.164: [    AGFW][3066025728] Agent sending reply for: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826
2011-05-20 12:12:23.164: [ora.diskmon][2684352256] [check] DiskmonAgent::check {
2011-05-20 12:12:23.164: [ora.diskmon][2684352256] [check] DiskmonAgent::connect {
2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::connect: skgznp_connect failed with error 56815 and the timeout expired
2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] (null) category: 56815, operation: connect, loc: skgznpcon6, OS error: 2, other:
2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::connect } error
2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::check } 2
2011-05-20 12:12:23.165: [    AGFW][2684352256] check for resource: ora.diskmon 1 1 completed with status: PLANNED_OFFLINE
2011-05-20 12:12:23.165: [    AGFW][3066025728] ora.diskmon 1 1 state changed from: CLEANING to: PLANNED_OFFLINE
2011-05-20 12:12:23.166: [    AGFW][3066025728] Agent sending last reply for: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826


---------
ohasd.log
---------
2011-05-20 12:12:23.167: [    AGFW][2053089024] Agfw Proxy Server sending the reply to PE for message:RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:825
2011-05-20 12:12:23.167: [   CRSPE][2042582784] Received reply to action [Clean] message ID: 825
2011-05-20 12:12:23.168: [    AGFW][2053089024] Received the reply to the message: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826 from the agent /u01/app/11.2.0/grid/bin/orarootagent_root
2011-05-20 12:12:23.168: [    AGFW][2053089024] Agfw Proxy Server sending the last reply to PE for message:RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:825
2011-05-20 12:12:23.169: [   CRSPE][2042582784] Received reply to action [Clean] message ID: 825
2011-05-20 12:12:23.169: [   CRSPE][2042582784] RI [ora.diskmon 1 1] new external state [OFFLINE] old value: [UNKNOWN] label = []
2011-05-20 12:12:23.169: [   CRSPE][2042582784] CRS-2681: Clean of 'ora.diskmon' on 'stgrac1' succeeded

2011-05-20 12:12:23.169: [   CRSPE][2042582784] Sequencer for [ora.diskmon 1 1] has completed with error: CRS-0215: Could not start resource 'ora.diskmon'.


./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  INTERMEDIATE stgrac1
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       stgrac1
ora.ctssd
      1        ONLINE  OFFLINE
ora.diskmon
      1        ONLINE  OFFLINE
ora.evmd
      1        ONLINE  ONLINE       stgrac1
ora.gipcd
      1        ONLINE  ONLINE       stgrac1
ora.gpnpd
      1        ONLINE  ONLINE       stgrac1
ora.mdnsd
      1        ONLINE  ONLINE       stgrac1
These errors look the same on 2 different RAC clusters (2 nodes per cluster).
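
To make the resource listing above easier to scan, the rows where STATE differs from TARGET can be pulled out with a short script. This is only a rough sketch: the function name `find_unhealthy` is hypothetical, and it assumes each resource name line (starting with `ora.`) is followed by a single indented instance line in the layout pasted above. The sample text is copied from that listing.

```python
def find_unhealthy(output):
    """Return (resource, target, state) for rows whose STATE differs from TARGET.

    Assumes the 'crsctl stat res -t -init' layout pasted above: a line with the
    resource name, then an indented line '<n> TARGET STATE [SERVER] [DETAILS]'.
    """
    unhealthy = []
    current = None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            current = stripped  # remember which resource the next row belongs to
            continue
        fields = stripped.split()
        # Instance rows start with a numeric instance id; headers and rules do not.
        if current and len(fields) >= 3 and fields[0].isdigit():
            target, state = fields[1], fields[2]
            if state != target:
                unhealthy.append((current, target, state))
    return unhealthy


# Sample copied from the listing above (server stgrac1).
SAMPLE = """\
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
Cluster Resources
ora.asm
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  INTERMEDIATE stgrac1
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       stgrac1
ora.ctssd
      1        ONLINE  OFFLINE
ora.diskmon
      1        ONLINE  OFFLINE
ora.evmd
      1        ONLINE  ONLINE       stgrac1
ora.gipcd
      1        ONLINE  ONLINE       stgrac1
ora.gpnpd
      1        ONLINE  ONLINE       stgrac1
ora.mdnsd
      1        ONLINE  ONLINE       stgrac1
"""

if __name__ == "__main__":
    for name, target, state in find_unhealthy(SAMPLE):
        print(f"{name}: target={target} state={state}")
```

On the listing above this picks out ora.asm, ora.crsd, ora.cssd, ora.ctssd and ora.diskmon, which matches the resources named in the log excerpts.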

Can anybody please give me some ideas on what I can check further?