Clusterware resource won't attempt restart.
We have set up a 2-node cluster using Oracle Clusterware, configured to run as an active/passive cluster.
On Node 1 there is a resource group which contains:
rg1
rg1.head
rg1.vip
rg1.listener
rg1.db_db1
This seemed to work OK: the group relocated cleanly following a reboot or a crs_relocate -f rg1 command, and killing individual processes caused them to attempt a restart as configured in their profiles.
A second database has been installed on the cluster. The database itself works OK on each node. It was added to the resource group as an rg1.db_db2 resource (using the same act_db.pl script, which came from here: http://www.oracle.com/technology/products/database/clusterware/pdf/SI_DB_Failover_11g.pdf).
Now the cluster is not working. Rebooting node 1 causes the resources to try to start on node 2, but the original rg1.db_db1 resource always fails, with the log showing this:
2009-07-09 16:22:35.832: [ CRSAPP][1552480576] StartResource error for rg1.db_db1 error code = 1
2009-07-09 16:22:36.202: [ CRSRES][1552480576] Start of `rg1.db_db1` on member `xenops-2` failed.
crs_profile -p rg1.db_db1
shows that the restart attempts are set to 5, but instead of trying again, everything sits there until node 1 comes back online, and then everything moves back there. (This is not straightforward either: the rg1.db_db1 resource tends to fail there as well, and it ping-pongs a bit between the two nodes until everything works.) The placement parameter in the profile is set to balanced, which I understand to mean the failed resource should try again on the node where the other resources are running until the restart attempts are used up.
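For reference, the relevant attributes in the rg1.db_db1 profile (as reported by crs_profile -p) look something like the fragment below. RESTART_ATTEMPTS=5 and PLACEMENT=balanced are the actual values; the other lines and the action-script path are illustrative placeholders, not copied from our profile:

    NAME=rg1.db_db1
    TYPE=application
    ACTION_SCRIPT=<path to act_db.pl>
    PLACEMENT=balanced
    RESTART_ATTEMPTS=5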
Any ideas?
thanks