11gR2 Clusterware not able to start up
729036 Apr 28 2012, edited May 3 2012
Hi Folks,
I am having an issue on a 2-node RAC. The installation went fine, but after rebooting both nodes the clusterware is not able to come online.
Oracle Clusterware active version on the cluster is [11.2.0.1.0]
grid:ORA-B-R-D-RAC3$ uname -a
Linux ORA-B-R-D-RAC3 2.6.18-238.5.1.el5 #1 SMP Mon Feb 21 05:52:39 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
grid:ORA-B-R-D-RAC3$
It looks like a permissions issue after the reboot. ASM is not coming up, so the OCR/voting disk on ASM is not accessible, which leaves the clusterware down.
I tried to start ASM manually too, but it looks like a permissions problem, as the ASM instance is pointing to the ORACLE_BASE /u01/app/oracle rather than /u01/app/grid.
grid:ORA-B-R-D-RAC3$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Sat Apr 28 09:18:27 2012
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORA-48141: error creating directory during ADR initialization [u01/app/oracle/diag/asm/+asm]
ORA-48189: OS command to create directory failed
Linux-x86_64 Error: 13: Permission denied
Additional information: 2
SQL>
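One way to narrow down the "Permission denied" above is to find the deepest directory on the ADR path that already exists and check whether the current OS user can create entries in it. The sketch below is only an illustration: the function names `first_existing_ancestor` and `check_creatable` are made up for this example, and it assumes it is run as the same user that starts ASM (here `grid`). Running it as root would report every directory as writable.

```shell
#!/bin/sh
# Print the deepest existing ancestor of a path, i.e. the directory in which
# the next missing component would have to be created.
first_existing_ancestor() {
    p=$1
    while [ ! -e "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        p=$(dirname "$p")
    done
    printf '%s\n' "$p"
}

# Show ownership of that ancestor and report whether the current user
# could create the missing part of the path underneath it.
check_creatable() {
    anc=$(first_existing_ancestor "$1")
    ls -ld "$anc"
    if [ -d "$anc" ] && [ -w "$anc" ] && [ -x "$anc" ]; then
        echo "OK: $anc is writable by $(id -un)"
    else
        echo "BLOCKED: $anc is not writable by $(id -un)"
        return 1
    fi
}
```

For example, `check_creatable /u01/app/oracle/diag/asm/+asm` run as `grid` would show which directory under /u01/app/oracle is owned by another user.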
I created a pfile for the ASM instance manually and tried starting the instance with it, but it failed as shown below (the pfile first, followed by the alert log output).
grid:ORA-B-R-D-RAC3$ cat init+ASM1.ora
asm_diskgroups='OCR_VOTING'
#asm_diskstring='ORCL:*'
asm_power_limit=1
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile='EXCLUSIVE'
grid:ORA-B-R-D-RAC3$
LMHB started with pid=12, OS id=8606
Thu Apr 26 20:51:56 2012
MMAN started with pid=13, OS id=8608
Thu Apr 26 20:51:56 2012
DBW0 started with pid=14, OS id=8610
Thu Apr 26 20:51:56 2012
LGWR started with pid=15, OS id=8612
Thu Apr 26 20:51:56 2012
CKPT started with pid=16, OS id=8614
Thu Apr 26 20:51:56 2012
SMON started with pid=17, OS id=8616
Thu Apr 26 20:51:56 2012
RBAL started with pid=18, OS id=8618
Thu Apr 26 20:51:57 2012
GMON started with pid=19, OS id=8620
Thu Apr 26 20:51:57 2012
MMON started with pid=20, OS id=8622
Thu Apr 26 20:51:57 2012
MMNL started with pid=21, OS id=8624
lmon registered with NM - instance number 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 2)
ASM instance
List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Apr 26 20:51:57 2012
LCK0 started with pid=22, OS id=8629
Thu Apr 26 20:51:58 2012
Shutting down instance (abort)
License high water mark = 2
USER (ospid: 8634): terminating the instance
Instance terminated by USER, pid = 8634
Thu Apr 26 20:51:58 2012
Instance shutdown complete
grid:ORA-B-R-D-RAC3$ cat crsconfig_params
# $Header: has/install/crsconfig/crsconfig_params.sbs /main/17 2009/03/11 20:18:23 dpham Exp $
#
# crsconfig.lib
#
# Copyright (c) 2000, 2009, Oracle and/or its affiliates.All rights reserved.
#
# NAME
# crsconfig_params.sbs - Installer variables required for root config
#
#
# crsconfig_params.sbs -
#
# ==========================================================
SILENT=false
ORACLE_OWNER=grid
ORA_DBA_GROUP=oinstall
ORA_ASM_GROUP=oinstall
LANGUAGE_ID=AMERICAN_AMERICA.AL32UTF8
ORACLE_HOME=/u01/app/11.2.0/grid
ORACLE_BASE=/u01/app/grid
JREDIR=/u01/app/11.2.0/grid/jdk/jre/
JLIBDIR=/u01/app/11.2.0/grid/jlib
NETCFGJAR_NAME=netcfg.jar
EWTJAR_NAME=ewt3.jar
JEWTJAR_NAME=jewt4.jar
SHAREJAR_NAME=share.jar
HELPJAR_NAME=help4.jar
EMBASEJAR_NAME=oemlt.jar
VNDR_CLUSTER=false
OCR_LOCATIONS=NO_VAL
CLUSTER_NAME=ORA-B-R-cluster
HOST_NAME_LIST=ORA-B-R-D-RAC3,ORA-B-R-D-RAC4
NODE_NAME_LIST=ORA-B-R-D-RAC3,ORA-B-R-D-RAC4
PRIVATE_NAME_LIST=
VOTING_DISKS=NO_VAL
#VF_DISCOVERY_STRING=%s_vfdiscoverystring%
ASM_UPGRADE=false
ASM_SPFILE=
ASM_DISK_GROUP=OCR_VOTING
ASM_DISCOVERY_STRING=
ASM_DISKS=ORCL:CRSDISK1,ORCL:CRSDISK2
ASM_REDUNDANCY=EXTERNAL
CRS_STORAGE_OPTION=1
CSS_LEASEDURATION=400
CRS_NODEVIPS='ORA-B-R-D-RAC3-VIP/255.255.254.0/eth0,ORA-B-R-D-RAC4-VIP/255.255.254.0/eth0'
NODELIST=ORA-B-R-D-RAC3,ORA-B-R-D-RAC4
NETWORKS="eth0"/10.193.58.0:public,"eth1"/192.168.1.0:cluster_interconnect
SCAN_NAME=ORA-B-R-D-RAC2-SCAN.sysdev.adroot.bmogc.net
SCAN_PORT=1521
GPNP_PA=
OCFS_CONFIG=
# GNS consts
GNS_CONF=false
GNS_ADDR_LIST=
GNS_DOMAIN_LIST=
GNS_ALLOW_NET_LIST=
GNS_DENY_NET_LIST=
GNS_DENY_ITF_LIST=
#### Required by OUI add node
NEW_HOST_NAME_LIST=
NEW_NODE_NAME_LIST=
NEW_PRIVATE_NAME_LIST=
NEW_NODEVIPS='ORA-B-R-D-RAC3-VIP/255.255.254.0/eth0,ORA-B-R-D-RAC4-VIP/255.255.254.0/eth0'
############### OCR constants
# GPNPCONFIGDIR is handled differently in dev (T_HAS_WORK for all)
# GPNPGCONFIGDIR in dev expands to T_HAS_WORK_GLOBAL
GPNPCONFIGDIR=$ORACLE_HOME
GPNPGCONFIGDIR=$ORACLE_HOME
OCRLOC=
OLRLOC=
OCRID=
CLUSTER_GUID=
CLSCFG_MISSCOUNT=
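Since the file above records ORACLE_BASE=/u01/app/grid while the failing ADR path sits under /u01/app/oracle, it may be worth comparing the recorded value with whatever ORACLE_BASE is set in the shell that starts ASM. The following is a rough sketch under that assumption; `param_value` and `compare_oracle_base` are hypothetical helper names, and the path to crsconfig_params is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Print the value of a KEY from a KEY=VALUE parameter file.
# Comment lines are skipped because the match is anchored at line start.
# $1 = file, $2 = key
param_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Compare the ORACLE_BASE recorded at install time with the one
# currently set in the environment of this shell.
compare_oracle_base() {
    recorded=$(param_value "$1" ORACLE_BASE)
    if [ "$recorded" = "${ORACLE_BASE:-}" ]; then
        echo "match: $recorded"
    else
        echo "mismatch: params=$recorded env=${ORACLE_BASE:-<unset>}"
        return 1
    fi
}
```

Called as `compare_oracle_base crsconfig_params` from the directory holding the file, a "mismatch" line would support the suspicion that the grid user's environment points at the wrong base directory.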
I checked the ASM disk headers with KFED, and all of them are valid disks.
The crsd.log is below:
2012-04-26 20:04:08.505: [ GPnP][856851168]clsgpnp_InitCKProviders: [at clsgpnp0.c:3891] Init gpnp local security key provider 1 of 2: file wallet (LSKP-FSW) OK
2012-04-26 20:04:08.505: [ GPnP][856851168]clsgpnp_InitCKProviders: [at clsgpnp0.c:3897] Init gpnp local security key proveders 2 of 2: OLR wallet (LSKP-CLSW-OLR)
[ CLWAL][856851168]clsw_Initialize: OLR initlevel [30000]
2012-04-26 20:04:08.506: [ CRSMAIN][1093040448] Policy Engine is not initialized yet!
2012-04-26 20:04:08.515: [ GPnP][856851168]clsgpnp_InitCKProviders: [at clsgpnp0.c:3919] Init gpnp local security key provider 2 of 2: OLR wallet (LSKP-CLSW-OLR) OK
2012-04-26 20:04:08.516: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;7. (2 providers - fatal if all fail)
2012-04-26 20:04:08.516: [ GPnP][856851168]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/ora-b-r-d-rac3/wallets/peer/
2012-04-26 20:04:08.525: [ GPnP][856851168]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/ora-b-r-d-rac3/wallets/peer/cwallet.sso'
2012-04-26 20:04:08.525: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
2012-04-26 20:04:08.525: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
2012-04-26 20:04:08.528: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;4. (2 providers - fatal if all fail)
2012-04-26 20:04:08.528: [ GPnP][856851168]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/ora-b-r-d-rac3/wallets/peer/
2012-04-26 20:04:08.535: [ GPnP][856851168]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/ora-b-r-d-rac3/wallets/peer/cwallet.sso'
2012-04-26 20:04:08.536: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
2012-04-26 20:04:08.536: [ GPnP][856851168]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
2012-04-26 20:04:08.536: [ GPnP][856851168]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=5077, tl=3, f=0
2012-04-26 20:04:08.552: [GIPCXCPT][856851168] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2012-04-26 20:04:08.554: [GIPCXCPT][856851168] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
2012-04-26 20:04:08.563: [ OCRASM][856851168]proprasmo: Error in open/create file in dg [OCR_VOTING]
[ OCRASM][856851168]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
2012-04-26 20:04:08.565: [ OCRASM][856851168]proprasmo: kgfoCheckMount returned [7]
2012-04-26 20:04:08.565: [ OCRASM][856851168]proprasmo: The ASM instance is down
2012-04-26 20:04:08.565: [ OCRRAW][856851168]proprioo: Failed to open [+OCR_VOTING]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2012-04-26 20:04:08.565: [ OCRRAW][856851168]proprioo: No OCR/OLR devices are usable
2012-04-26 20:04:08.565: [ OCRASM][856851168]proprasmcl: asmhandle is NULL
2012-04-26 20:04:08.565: [ OCRRAW][856851168]proprinit: Could not open raw device
2012-04-26 20:04:08.565: [ OCRASM][856851168]proprasmcl: asmhandle is NULL
2012-04-26 20:04:08.565: [ OCRAPI][856851168]a_init:16!: Backend init unsuccessful : [26]
2012-04-26 20:04:08.565: [ CRSOCR][856851168] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2012-04-26 20:04:08.565: [ CRSD][856851168][PANIC] CRSD exiting: Could not init OCR, code: 26
2012-04-26 20:04:08.566: [ CRSD][856851168] Done.
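To pull the actual error codes out of a long trace like the one above, a small filter can help. This is just a sketch: `error_codes` is a name invented for this example, the log path is whatever file you pass in, and it relies on `grep -o`, which GNU and BSD grep support but which is not in POSIX.

```shell
#!/bin/sh
# List each distinct ORA-/PROC-/CRS- code found in a log file,
# with its occurrence count, most frequent first.
# $1 = path to the log file
error_codes() {
    grep -oE '(ORA|PROC|CRS)-[0-9]+' "$1" | sort | uniq -c | sort -rn
}
```

Run against a saved copy of crsd.log, this reduces the trace to the handful of codes worth looking up.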