Clusterware won't start after PSU update

2745406, Mar 7 2016 (edited Mar 8 2016)

Hello all,

I have a three-node cluster, with all three nodes running Oracle Linux 6 and Oracle Grid Infrastructure 11.2.0.4. Over the weekend we had a maintenance window because of the glibc bug that is currently in the news.

Because the databases must stay highly available (over 80 databases, all single instance, but we can stop them on one node and start them on another), I moved all databases running on node qum148 to the other two nodes, stopped the Clusterware, and installed the OS updates via yum. Then I rebooted the node. After the reboot completed cleanly, I started the Clusterware as root with "crsctl start crs". After about 10 minutes everything was running.

Then we thought, "since we have a maintenance window anyway, let's also install the latest PSU for the GI and the database homes". So I generated a response file from the Grid Infrastructure home so that we could install patch 22191577 (GI PSU for January 2016):

$GRID_HOME/OPatch/ocm/bin/emocmrsp -no_banner -output /tmp/qum148.rsp

With that file I started patching the GI as root:

$GRID_HOME/OPatch/opatch auto /path/to/patch/22191577 -ocmrf /tmp/qum148.rsp

The patching took about 30 minutes and everything looked fine. So I rebooted the node again. After it was back up, I tried to start CRS as root, but this time the processes didn't come up:

$ crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
root@QUM148 - CRS_11_2_0_4:~
$ crsctl stat res -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE
ora.crf
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        OFFLINE OFFLINE
ora.ctssd
      1        ONLINE  OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  OFFLINE
ora.gipcd
      1        ONLINE  OFFLINE
ora.gpnpd
      1        ONLINE  UNKNOWN      qum148
ora.mdnsd
      1        ONLINE  UNKNOWN      qum148
root@QUM148 - CRS_11_2_0_4:/usr/local/grid/11.2.0.4/log/qum148
$

This status won't change. But why?
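To make that listing easier to scan, here is a minimal sketch of a filter that prints only the resources whose STATE differs from their TARGET. The function name list_mismatched is hypothetical (it is not part of crsctl), and the parsing assumes the layout shown above: a resource name line starting with "ora." followed by an instance line of the form "id TARGET STATE [SERVER]".

```shell
#!/usr/bin/env bash
# Hypothetical helper, not an Oracle tool: reads "crsctl stat res -init -t"
# output on stdin and prints each resource whose STATE differs from TARGET.
# Assumes the two-line-per-resource layout shown in the post.
list_mismatched() {
  awk '
    /^ora\./ { name = $1; next }             # remember the resource name line
    name != "" && $1 ~ /^[0-9]+$/ {          # instance line: id TARGET STATE [SERVER]
      if ($2 != $3)
        printf "%s target=%s state=%s\n", name, $2, $3
      name = ""                              # wait for the next resource name
    }
  '
}

# Usage (as root, with the Grid home binaries in PATH):
#   crsctl stat res -init -t | list_mismatched
```

Header lines, separator lines, and blank lines are ignored because they neither start with "ora." nor begin with a numeric instance id.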

So I first looked at the alert log, $GRID_HOME/log/qum148/alertqum148.log. Every 3-5 seconds it shows:

[client(23784)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2016-03-07 14:32:44.098:
[client(23784)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /usr/local/grid/11.2.0.4/log/qum148/client/crsctl_oraoma.log.
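Since these messages repeat every few seconds, a quick tally of the CRS message codes can show whether anything else is hidden between the repeats. This is only a sketch: count_crs_codes is a hypothetical name, and it assumes a grep that supports -o and -E (GNU grep does).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads alert log text on stdin and prints each distinct
# CRS-nnnn message code with the number of times it occurs, sorted by code.
count_crs_codes() {
  grep -oE 'CRS-[0-9]+' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }'   # uniq -c prints "count code"; flip to "code count"
}

# Usage, with the alert log path from this post:
#   count_crs_codes < "$GRID_HOME/log/qum148/alertqum148.log"
```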

Nothing more than that. So next I looked at /usr/local/grid/11.2.0.4/log/qum148/client/crsctl_oraoma.log. I'm wondering about the filename, because "oraoma" is also the user under which we run the Enterprise Manager Cloud Control agent, but that is a side note.

Here is what crsctl_oraoma.log says:

2016-03-07 14:32:40.679: [  OCRASM][1709102880]ASM Error Stack : ORA-29701: unable to connect to Cluster Synchronization Service
2016-03-07 14:32:40.681: [  OCRASM][1709102880]proprasmo: kgfoCheckMount returned [7]
2016-03-07 14:32:40.681: [  OCRASM][1709102880]proprasmo: The ASM instance is down
2016-03-07 14:32:40.883: [  OCRRAW][1709102880]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2016-03-07 14:32:40.884: [  OCRRAW][1709102880]proprioo: No OCR/OLR devices are usable
2016-03-07 14:32:40.884: [  OCRASM][1709102880]proprasmcl: asmhandle is NULL
2016-03-07 14:32:40.884: [  OCRRAW][1709102880]proprinit: Could not open raw device
2016-03-07 14:32:40.884: [  OCRASM][1709102880]proprasmcl: asmhandle is NULL
2016-03-07 14:32:40.884: [ default][1709102880]a_init:7!: Backend init unsuccessful : [26]
2016-03-07 14:32:41.046: [  OCRMSG][2316560160]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2016-03-07 14:32:41.046: [  OCRMSG][2316560160]GIPC error [29] msg [gipcretConnectionRefused]
2016-03-07 14:32:41.046: [  OCRMSG][2316560160]prom_connect: error while waiting for connection complete [24]
[   CLWAL][2316560160]clsw_Initialize: OLR initlevel [30000]
2016-03-07 14:32:44.096: [ default][2316560160]Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2016-03-07 14:32:44.096: [  OCRASM][2316560160]proprasmo: Error [13] in opening the GPNP profile. Try to get offline profile
2016-03-07 14:32:44.096: [    GPNP][2316560160]clsgpnp_getOfflineProfile: [at clsgpnp.c:583] Result: (8) CLSGPNP_PERMS. Must be a privileged user to get an offline GPnP profile.
2016-03-07 14:32:44.096: [  OCRASM][2316560160]proprasmo: Error [8] in opening the GPNP offline profile.
2016-03-07 14:32:44.096: [  OCRASM][2316560160]proprasmo: Error in open/create file in dg [CRS]
[  OCRASM][2316560160]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=29701, loc=kgfokge
2016-03-07 14:32:44.096: [  OCRASM][2316560160]ASM Error Stack : ORA-29701: unable to connect to Cluster Synchronization Service
2016-03-07 14:32:44.098: [  OCRASM][2316560160]proprasmo: kgfoCheckMount returned [7]
2016-03-07 14:32:44.098: [  OCRASM][2316560160]proprasmo: The ASM instance is down
2016-03-07 14:32:44.300: [  OCRRAW][2316560160]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2016-03-07 14:32:44.300: [  OCRRAW][2316560160]proprioo: No OCR/OLR devices are usable
2016-03-07 14:32:44.300: [  OCRASM][2316560160]proprasmcl: asmhandle is NULL
2016-03-07 14:32:44.300: [  OCRRAW][2316560160]proprinit: Could not open raw device
2016-03-07 14:32:44.300: [  OCRASM][2316560160]proprasmcl: asmhandle is NULL
2016-03-07 14:32:44.300: [ default][2316560160]a_init:7!: Backend init unsuccessful : [26]
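The same few messages repeat in this log, so it may help to condense it before reading further. Below is a small sketch that lists the distinct ORA- errors and counts lines per component tag (the first bracketed field, such as OCRASM or OCRRAW). Both function names are hypothetical, and the pattern assumes the "[ COMPONENT][thread-id]" layout seen in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for condensing a Clusterware client log read on stdin.

# Print each distinct ORA-nnnnn error code once.
ora_errors() {
  grep -oE 'ORA-[0-9]+' | LC_ALL=C sort -u
}

# Count lines per component tag, e.g. "[  OCRASM][1709102880]..." -> OCRASM.
# Lines without such a tag are skipped.
component_counts() {
  sed -nE 's/^[^[]*\[ *([A-Za-z]+)\]\[[0-9]+\].*/\1/p' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }'   # flip "count tag" to "tag count"
}

# Usage, with the log path from this post:
#   log=/usr/local/grid/11.2.0.4/log/qum148/client/crsctl_oraoma.log
#   ora_errors < "$log"
#   component_counts < "$log"
```

On the excerpt above, the only ORA- code is ORA-29701, which puts the focus on why CSS cannot be reached rather than on the disk group itself.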

OK, so my +CRS ASM disk group cannot be mounted. But here I'm stuck. I have been analyzing and googling for about 12 hours, and I still can't figure out why it can't mount the +CRS disk group, which in turn keeps the Clusterware from starting.

Can you please assist?

Thanks and regards,

David

This post has been answered by 2745406 on Mar 8 2016
Locked on Apr 5 2016
Added on Mar 7 2016