We have a two-node 12c RAC cluster that was working fine.
DB: 12.1.0.2.0 RAC
Node 1 was rebooted without first stopping the clusterware running on it. Since the reboot, clusterware no longer starts on both nodes.
These are the errors we see in ocssd.trc (the lines repeat):
2016-08-23 04:25:37.203021 : CSSD:1944577792: clssnmvDHBValidateNCopy: node 1, host1node1, has a disk HB, but no network HB, DHB has rcfg 367383874, wrtcnt, 5697828, LATS 6947784, lastSeqNo 5697825, uniqueness 1471946647, timestamp 1471951537/22270864
2016-08-23 04:25:37.247771 : CSSD:3130889984: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2016-08-23 04:25:37.247780 : CSSD:3130889984: clsssc_CLSFAInit_CB: clsfa fencing not ready yet
2016-08-23 04:25:37.668805 : CSSD:3119855360: clssscWaitOnEventValue: after CmInfo State val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2016-08-23 04:25:37.739951 :GIPCHALO:3108542208: gipchaLowerSendEstablish: sending establish message for node '0x7ff8740b3140 { host 'host1node1', haName '0856-76ab-12c8-ed9c', srcLuid aa264460-ba9acee3, dstLuid 00000000-00000000 numInf 0, sentRegister 0, localMonitor 0, baseStream 0x7ff8740a1060 type gipchaNodeType12001 (20), nodeIncarnation 92606464-fffc8668 incarnation 4 flags 0x100004}'
2016-08-23 04:25:38.160519 : CSSD:1933539072: clssnmRcfgMgrThread: Local Join
2016-08-23 04:25:38.160535 : CSSD:1933539072: clssnmLocalJoinEvent: begin on node(2), waittime 193000
2016-08-23 04:25:38.160543 : CSSD:1933539072: clssnmLocalJoinEvent: set curtime (6948744) for my node
2016-08-23 04:25:38.160547 : CSSD:1933539072: clssnmLocalJoinEvent: scanning 32 nodes
2016-08-23 04:25:38.160554 : CSSD:1933539072: clssnmLocalJoinEvent: Node host1node1, number 1, is in an existing cluster with disk state 3
2016-08-23 04:25:38.160574 : CSSD:1933539072: clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
2016-08-23 04:25:38.165099 : CSSD:1935116032: clssnmSendingThread: Connection pending for node host1node1, number 1, flags 0x00000002
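Since these lines repeat many times, a small script can help summarize the trace. The following is only a minimal sketch, assuming the trace lines follow the `timestamp : COMPONENT:thread: function: message` layout seen above; the names `LINE_RE`, `NO_NET_HB_RE` and `summarize` are made up for illustration and are not part of any Oracle tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# 2016-08-23 04:25:37.203021 : CSSD:1944577792: clssnmvDHBValidateNCopy: message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s*:\s*"
    r"(?P<comp>\w+):(?P<thread>\d+):\s*"
    r"(?P<func>\w+):\s*(?P<msg>.*)$"
)

# Message text copied from the clssnmvDHBValidateNCopy line in the trace.
NO_NET_HB_RE = re.compile(
    r"node (\d+), ([^,\s]+), has a disk HB, but no network HB"
)


def summarize(lines):
    """Count trace lines per function and per node reported as having
    a disk heartbeat but no network heartbeat."""
    funcs = Counter()
    no_net_hb = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not in the assumed layout
        funcs[m.group("func")] += 1
        hb = NO_NET_HB_RE.search(m.group("msg"))
        if hb:
            no_net_hb[(int(hb.group(1)), hb.group(2))] += 1
    return funcs, no_net_hb
```

Passing an open file handle for ocssd.trc to `summarize` returns two counters: one keyed by function name and one keyed by `(node number, host name)`. In our case the second counter would show how often node 1 is seen on the voting disk but not over the private network.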
Clusterware active version on the cluster is [12.1.0.2.0]
What we have tried, separately or in combination:
Killed gipcd.bin on both nodes and started the cluster.
Restarted the private network interfaces, rebooted the server, enabled multicasting.
Reconfigured eth1 (the private NIC).
The ping command across the cluster nodes works fine.
The profile.xml files are similar on both nodes.
We followed MOS notes and a considerable number of blog entries.
The sysadmin says there is nothing to be done on their side.
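To be more precise than "similar" about the profile.xml files, the two copies could be compared line by line. This is only a sketch, assuming the contents of both files have already been read into strings; `diff_profiles` and the default file labels are hypothetical names used for illustration.

```python
import difflib


def diff_profiles(text_a, text_b,
                  name_a="node1/profile.xml",   # hypothetical label
                  name_b="node2/profile.xml"):  # hypothetical label
    """Return unified-diff lines between two profile.xml contents.

    An empty list means the two texts are identical line for line.
    """
    a = text_a.splitlines()
    b = text_b.splitlines()
    return list(
        difflib.unified_diff(a, b, fromfile=name_a, tofile=name_b, lineterm="")
    )
```

An empty result means the files match exactly; otherwise the lines starting with `-` and `+` show where node 1 and node 2 differ, for example in the private network definition.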
Clusterware comes online only on the node where we run the start cluster command; the other node does not get added to the cluster. We can start clusterware on only one node at a time.
If we run crsctl start cluster on the second node, it shows CRS-2674: Start of 'ora.cssd' on 'host2node2' failed, and clusterware does not start on it.
Can someone please help me with this?