rs-ora: resource group failed to start on chosen node; it may end up failing
807567, Oct 14 2009 (edited Feb 1 2010)
I have configured a two-node failover cluster environment using Netra a/d 1000 storage. When I try to deploy the Oracle server application, it throws the following error:
rs-ora: resource group failed to start on chosen node; it may end up failing over to other node(s)
I created a metaset and added one raw DID disk to it.
I then created a logical hostname resource and an HAStoragePlus resource, and brought the resource group online with the following command:
# clrg online -emM rg-ora
Next, I created the Oracle server resource with the following command:
#clrs create -g rg-ora -t SUNW.oracle_server -p ORACLE_HOME=/global/oracle/product/10.2.0/db_1 -p ORACLE_SID=infra -p Alert_log_file=/global/oracle/product/10.2.0/db_1/admin/infra/bdump/alert_infra.log -p Connect_string=sysdba/dbadmin1@infra -p Resource_dependencies=rs-ora-has rs-ora
node1 - Validation failed. ORACLE_HOME /global/oracle/product/10.2.0/db_1 does not exist
node1 - ALERT_LOG_FILE /global/oracle/product/10.2.0/db_1/admin/infra/bdump/alert_infra.log doesn't exist
node1 - PARAMETER_FILE: /global/oracle/product/10.2.0/db_1/dbs/initinfra.ora nor server PARAMETER_FILE: /global/oracle/product/10.2.0/db_1/dbs/spfileinfra.ora exists
node1 - This resource depends on a HAStoragePlus resouce that is not online on this node. Ignoring validation errors.
rs-ora: resource group failed to start on chosen node; it may end up failing over to other node(s)
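The validation messages above all complain about paths under /global/oracle, so one quick check is to test those same paths by hand on the node where rg-ora is online. The following is only a minimal sketch: check_paths is a hypothetical helper name (not a Sun Cluster command), and the paths are copied from the messages above.

```shell
# Report whether each path given as an argument exists on this node.
# check_paths is a hypothetical helper, not part of Sun Cluster.
check_paths() {
    status=0
    for p in "$@"; do
        # The paths of interest are regular files or directories.
        if [ -f "$p" ] || [ -d "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            status=1
        fi
    done
    # Non-zero return if at least one path was missing.
    return $status
}

# Values taken from the clrs create command and its validation output.
ORACLE_HOME=/global/oracle/product/10.2.0/db_1
ORACLE_SID=infra

check_paths \
    "${ORACLE_HOME}" \
    "${ORACLE_HOME}/admin/${ORACLE_SID}/bdump/alert_${ORACLE_SID}.log" \
    "${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora" \
    "${ORACLE_HOME}/dbs/spfile${ORACLE_SID}.ora" \
    || echo "one or more paths are missing on this node"
```

If every path is reported as MISSING while the HAStoragePlus resource is online, it may be worth confirming that /global/oracle is actually mounted on that node before looking at the Oracle resource itself.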
The status of the Oracle resource is as follows:
Resource Name   Node Name   State          Status Message
rs-ora          node1       Start failed   Faulted
I am using Solaris 10 update 6 (patch level Generic_137137-09), Oracle 10.2.0, and Sun Cluster 3.2 update 1. The vfstab entries and /var/adm/messages from both nodes follow.
Node1#grep ora /etc/vfstab
/dev/md/oradg/dsk/d300 /dev/md/oradg/rdsk/d300 /global/oracle ufs 5 no logging
Node2#grep ora /etc/vfstab
/dev/md/oradg/dsk/d300 /dev/md/oradg/rdsk/d300 /global/oracle ufs 5 no logging
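For readers less familiar with the vfstab layout, the seven fields of these entries can be labelled with a short awk filter. This is a sketch only: show_vfstab is a hypothetical helper name, and the sample file simply reuses the entry shown above so the snippet can run anywhere. On a cluster node the second argument can be omitted to read /etc/vfstab.

```shell
# Label the seven vfstab fields for entries whose mount point starts
# with the given prefix. show_vfstab is a hypothetical helper name.
show_vfstab() {
    prefix=$1
    file=${2:-/etc/vfstab}
    awk '
        /^#/ || NF < 7 { next }          # skip comments and short lines
        index($3, prefix) == 1 {
            print "device to mount : " $1
            print "device to fsck  : " $2
            print "mount point     : " $3
            print "FS type         : " $4
            print "fsck pass       : " $5
            print "mount at boot   : " $6
            print "mount options   : " $7
        }' prefix="$prefix" "$file"
}

# Sample file containing the entry from this post.
sample=/tmp/vfstab.sample.$$
cat > "$sample" <<'EOF'
/dev/md/oradg/dsk/d300 /dev/md/oradg/rdsk/d300 /global/oracle ufs 5 no logging
EOF

show_vfstab /global/oracle "$sample"
rm -f "$sample"
```

Read this way, the entry has mount at boot set to no and logging as its only mount option, identically on both nodes.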
Node1#more /var/adm/messages
Oct 17 05:19:17 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_prenet_start> for resource <ha-host-1>, resource group <rg-ora>, node <node1>, timeout <300> seconds
Oct 17 05:19:17 node1 Cluster.RGM.rgmd: [ID 751138 daemon.notice] 47 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_prenet_start>:tag=<rg-ora.ha-host-1.10>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:17 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_prenet_start> completed successfully for resource <ha-host-1>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <300 seconds>
Oct 17 05:19:17 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_prenet_start> for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, timeout <1800> seconds
Oct 17 05:19:17 node1 Cluster.RGM.rgmd: [ID 751138 daemon.notice] 47 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_prenet_start>:tag=<rg-ora.rs-ora-has.10>: Calling security_clnt_connect(..., host=<testlab5>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:18 node1 Cluster.RGM.rgmd: [ID 375444 daemon.notice] 8 fe_rpc_command: cmd_type(enum):<2>:cmd=<null>:tag=<rg-ora.rs-ora-has.10>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<0>, ...)
Oct 17 05:19:18 node1 Cluster.RGM.rgmd: [ID 316625 daemon.notice] Timeout monitoring on method tag <rg-ora.rs-ora-has.10> has been suspended.
Oct 17 05:19:20 node1 Cluster.Framework: [ID 801593 daemon.notice] stdout: becoming primary for oradg
Oct 17 05:19:21 node1 Cluster.RGM.rgmd: [ID 375444 daemon.notice] 8 fe_rpc_command: cmd_type(enum):<3>:cmd=<null>:tag=<rg-ora.rs-ora-has.10>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<0>, ...)
Oct 17 05:19:21 node1 Cluster.RGM.rgmd: [ID 316625 daemon.notice] Timeout monitoring on method tag <rg-ora.rs-ora-has.10> has been resumed.
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hastorageplus_prenet_start> completed successfully for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <1800 seconds>
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_start> for resource <ha-host-1>, resource group <rg-ora>, node <node1>, timeout <500> seconds
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 751138 daemon.notice] 47 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_start>:tag=<rg-ora.ha-host-1.0>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_start> completed successfully for resource <ha-host-1>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <500 seconds>
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hafoip_monitor_start> for resource <ha-host-1>, resource group <rg-ora>, node <node1>, timeout <300> seconds
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_start> for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, timeout <90> seconds
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 510020 daemon.notice] 46 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hafoip/hafoip_monitor_start>:tag=<rg-ora.ha-host-1.7>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 751138 daemon.notice] 47 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_start>:tag=<rg-ora.rs-ora-has.0>: Calling security_clnt_connect(..., host=<node1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hafoip_monitor_start> completed successfully for resource <ha-host-1>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <300 seconds>
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hastorageplus_start> completed successfully for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <90 seconds>
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_monitor_start> for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, timeout <90> seconds
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 751138 daemon.notice] 47 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_monitor_start>:tag=<rg-ora.rs-ora-has.7>: Calling security_clnt_connect(..., host=<testlab5>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
Oct 17 05:19:25 node1 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <hastorageplus_monitor_start> completed successfully for resource <rs-ora-has>, resource group <rg-ora>, node <node1>, time used: 0% of timeout <90 seconds>
Oct 17 05:19:38 node1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <bin/oracle_server_validate> for resource <rs-ora>, resource group <rg-ora>, node <node1>, timeout <120> seconds