
CRS-4404: The following nodes did not reply within the allotted time- node is not getting added to t

knowledgespring, Aug 23 2016 (edited Sep 10 2016)

We have a two-node 12c RAC cluster that had been working fine.

DB version: 12.1.0.2.0 (RAC)

Node 1 was rebooted without first stopping the clusterware running on it. Since the reboot, the clusterware no longer starts on both nodes.

These are the errors we see in ocssd.trc (the lines repeat):

2016-08-23 04:25:37.203021 : CSSD:1944577792: clssnmvDHBValidateNCopy: node 1, host1node1, has a disk HB, but no network HB, DHB has rcfg 367383874, wrtcnt, 5697828, LATS 6947784, lastSeqNo 5697825, uniqueness 1471946647, timestamp 1471951537/22270864

2016-08-23 04:25:37.247771 : CSSD:3130889984: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization

2016-08-23 04:25:37.247780 : CSSD:3130889984: clsssc_CLSFAInit_CB: clsfa fencing not ready yet

2016-08-23 04:25:37.668805 : CSSD:3119855360: clssscWaitOnEventValue: after CmInfo State val 3, eval 1 waited 1000 with cvtimewait status 4294967186

2016-08-23 04:25:37.739951 :GIPCHALO:3108542208: gipchaLowerSendEstablish: sending establish message for node '0x7ff8740b3140 { host 'host1node1', haName '0856-76ab-12c8-ed9c', srcLuid aa264460-ba9acee3, dstLuid 00000000-00000000 numInf 0, sentRegister 0, localMonitor 0, baseStream 0x7ff8740a1060 type gipchaNodeType12001 (20), nodeIncarnation 92606464-fffc8668 incarnation 4 flags 0x100004}'

2016-08-23 04:25:38.160519 : CSSD:1933539072: clssnmRcfgMgrThread: Local Join

2016-08-23 04:25:38.160535 : CSSD:1933539072: clssnmLocalJoinEvent: begin on node(2), waittime 193000

2016-08-23 04:25:38.160543 : CSSD:1933539072: clssnmLocalJoinEvent: set curtime (6948744) for my node

2016-08-23 04:25:38.160547 : CSSD:1933539072: clssnmLocalJoinEvent: scanning 32 nodes

2016-08-23 04:25:38.160554 : CSSD:1933539072: clssnmLocalJoinEvent: Node host1node1, number 1, is in an existing cluster with disk state 3

2016-08-23 04:25:38.160574 : CSSD:1933539072: clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk

2016-08-23 04:25:38.165099 : CSSD:1935116032: clssnmSendingThread: Connection pending for node host1node1, number 1, flags 0x00000002
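
As a rough aid for reading a trace like this, here is a minimal Python sketch that counts how often each node is reported with a disk heartbeat but no network heartbeat. The regular expression is modelled only on the clssnmvDHBValidateNCopy line quoted above, so it is an assumption that other releases word the message the same way; the function name and the SAMPLE list are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the ocssd.trc line quoted in this post (assumption:
# other Clusterware releases may word this message differently).
DHB_NO_NHB = re.compile(
    r"clssnmvDHBValidateNCopy: node (?P<num>\d+), (?P<host>[^,]+), "
    r"has a disk HB, but no network HB"
)


def count_missing_network_hb(lines):
    """Count, per (node number, host), the lines that report a disk HB without a network HB."""
    counts = Counter()
    for line in lines:
        match = DHB_NO_NHB.search(line)
        if match:
            counts[(int(match.group("num")), match.group("host"))] += 1
    return counts


# Two lines copied from the trace above, used only as sample input.
SAMPLE = [
    "2016-08-23 04:25:37.203021 : CSSD:1944577792: clssnmvDHBValidateNCopy: "
    "node 1, host1node1, has a disk HB, but no network HB, DHB has rcfg 367383874, "
    "wrtcnt, 5697828, LATS 6947784, lastSeqNo 5697825, uniqueness 1471946647, "
    "timestamp 1471951537/22270864",
    "2016-08-23 04:25:38.160519 : CSSD:1933539072: clssnmRcfgMgrThread: Local Join",
]

if __name__ == "__main__":
    for (num, host), n in count_missing_network_hb(SAMPLE).items():
        print(f"node {num} ({host}): {n} line(s) with disk HB but no network HB")
```

To run it against the real trace, pass an open file handle for ocssd.trc instead of SAMPLE. A steadily growing count for one node would be consistent with that node still writing to the voting disk while being unreachable over the private network.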

Clusterware active version on the cluster is [12.1.0.2.0]

We tried the following, separately and in combination:

Killed gipcd.bin on both nodes, then started the cluster.

Restarted the private network interfaces, rebooted the server, and enabled multicasting.

Reconfigured eth1 (the private NIC).

Ping between the cluster nodes works fine.

The profile.xml files are similar on both nodes.

Followed MOS notes and a considerable number of blog entries.

The sysadmin says there is nothing more to be done on their side.
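
One note on the checks above: ping uses unicast ICMP, so a working ping does not by itself show that UDP multicast traffic passes between the private interfaces. Below is a minimal Python sketch for testing that separately. The group 230.0.1.0 and port 42424 are assumed values used only for illustration and should be checked against Oracle's documentation for your release; the function names are hypothetical, and the local private-interface IP has to be supplied by you.

```python
import socket
import sys

GROUP = "230.0.1.0"  # assumed multicast group, illustrative only
PORT = 42424         # assumed UDP port, illustrative only


def make_membership_request(group, iface_ip):
    """Build the ip_mreq payload for IP_ADD_MEMBERSHIP: group address + local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def receive(iface_ip, timeout=30.0):
    """Join GROUP on the given interface and wait for one datagram; return None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_membership_request(GROUP, iface_ip))
        sock.settimeout(timeout)
        return sock.recvfrom(1024)
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(iface_ip, message=b"mcast-test"):
    """Send one datagram to GROUP out of the given interface, with TTL 1 (same subnet only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(iface_ip))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(message, (GROUP, PORT))
    finally:
        sock.close()


if __name__ == "__main__":
    if len(sys.argv) != 3 or sys.argv[1] not in ("recv", "send"):
        print("usage: mcast_check.py recv|send <local-private-ip>")
    elif sys.argv[1] == "recv":
        print(receive(sys.argv[2]) or "nothing received before timeout")
    else:
        send(sys.argv[2])
```

The idea is to start it in recv mode on one node with that node's eth1 address, then run it in send mode on the other node, and repeat in the opposite direction. It is probably best run while clusterware is stopped on the node under test, so the test socket does not share the port with a running daemon.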

Clusterware comes online only on the node where we run the start cluster command; the other node is not added to the cluster. We can start clusterware on only one node at a time.

If we run crsctl start cluster on the second node, it reports CRS-2674: Start of 'ora.cssd' on 'host2node2' failed, and clusterware does not start on that node.

Can someone please help me with this?

This post has been answered by knowledgespring on Sep 1 2016


Locked on Sep 29 2016


#performance-availability, #real-application-clusters
