OCFS2 Global Heartbeat Failed o2cb: Heartbeat region could not be found

3acc46a9-9200-44c2-8f0e-016079d40b5f, Nov 10 2016 (edited Nov 12 2016)

We are setting up a 2-node cluster with the latest Oracle Linux 7.2 and the latest OCFS2.

Installed kernel: 3.8.13-118.14.1.el7uek.x86_64

ocfs2-tools.x86_64 1.8.6-7.el7

cluster.conf looks like this on both nodes:

node:
        number = 0
        cluster = ocfs2
        ip_port = 7777
        ip_address = 10.1.1.101
        name = ciosds01

node:
        number = 1
        cluster = ocfs2
        ip_port = 7777
        ip_address = 10.1.1.102
        name = ciosds02

cluster:
        heartbeat_mode = global
        node_count = 2
        name = ocfs2

heartbeat:
        cluster = ocfs2
        region = C39417265C094AA39EE6DA622B748BEA
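
Since the heartbeat stanza carries the region that the failing step complains about, it may help to print exactly which region values the file declares on each node and compare them. This is a minimal sketch, assuming the stanza layout shown above; `list_regions` is a hypothetical helper name (not part of ocfs2-tools), and the default path /etc/ocfs2/cluster.conf is the usual location rather than something stated in this post.

```shell
#!/bin/sh
# Print every "region = <UUID>" value declared in a cluster.conf-style file.
# list_regions is a hypothetical helper, not an ocfs2-tools command.
list_regions() {
    # Assumed default location; pass another path as the first argument.
    conf="${1:-/etc/ocfs2/cluster.conf}"
    # Attribute lines look like "<indent>key = value": match the key and "=".
    awk '$1 == "region" && $2 == "=" { print $3 }' "$conf"
}
```

Running `list_regions` on both nodes should print the same region on each if the two files agree.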

Here is the command we used to prepare the OCFS2 LUN:

mkfs.ocfs2 -F -b 4K -C 64K -N 4 -L ocfs2vol1 --cluster-name=ocfs2 --cluster-stack=o2cb --global-heartbeat /dev/scinid

mounted.ocfs2 -d

Device       Stack  Cluster  F  UUID                              Label

/dev/scinid  o2cb   ocfs2    G  C39417265C094AA39EE6DA622B748BEA  ocfs2vol1

blkid

/dev/scinid: LABEL="ocfs2vol1" UUID="c3941726-5c09-4aa3-9ee6-da622b748bea" TYPE="ocfs2"
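
The blkid UUID is lowercase and hyphenated, while the region in cluster.conf and the mounted.ocfs2 output is uppercase with no hyphens, so the two are easy to misread as different. The sketch below normalizes one form into the other and compares them, using the values from this post; `normalize_uuid` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a blkid-style UUID (lowercase, hyphenated) into the form used by
# mounted.ocfs2 and the cluster.conf region (uppercase, no hyphens).
# normalize_uuid is a hypothetical helper, not an ocfs2-tools command.
normalize_uuid() {
    printf '%s\n' "$1" | sed 's/-//g' | tr '[:lower:]' '[:upper:]'
}

# Values copied from the cluster.conf and blkid output in this post.
region="C39417265C094AA39EE6DA622B748BEA"
disk_uuid="c3941726-5c09-4aa3-9ee6-da622b748bea"

if [ "$(normalize_uuid "$disk_uuid")" = "$region" ]; then
    echo "region matches on-disk UUID"
else
    echo "region does NOT match on-disk UUID"
fi
```

With the values above the two forms match, so a simple typo in the region does not appear to be the cause here.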

When we run the O2CB configure step, the global heartbeat fails to start:

/sbin/o2cb.init configure

Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [y]:
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
checking debugfs...
Setting cluster stack "o2cb": OK
Registering O2CB cluster "ocfs2": OK
Setting O2CB cluster timeouts : OK
Starting global heartbeat for cluster "ocfs2": Failed
o2cb: Heartbeat region could not be found C39417265C094AA39EE6DA622B748BEA
Stopping global heartbeat on cluster "ocfs2": OK
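
The error message implies that the region could not be resolved to a block device at heartbeat start. That same lookup can be reproduced by hand from `mounted.ocfs2 -d` output, as a sanity check that a device carrying the region is visible on the node. This is only a sketch: `device_for_region` is a hypothetical helper name, and the assumption that the F column may be empty for some devices is the reason it scans every field rather than a fixed column.

```shell
#!/bin/sh
# Read "mounted.ocfs2 -d" output on stdin and print the device whose UUID
# equals the given region. device_for_region is a hypothetical helper.
device_for_region() {
    awk -v r="$1" '
        NR > 1 {
            # Scan all fields after the device name so the match still
            # works if the F column is empty and the columns shift.
            for (i = 2; i <= NF; i++)
                if ($i == r) { print $1; break }
        }'
}
```

For example, `mounted.ocfs2 -d | device_for_region C39417265C094AA39EE6DA622B748BEA` run on each node would be expected to print /dev/scinid, given the output shown earlier in this post.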

SELinux is disabled on both nodes.

We tried the same on both nodes in the cluster, with the same failure.

If we switch to local heartbeat, the cluster comes online without problems.

Unfortunately, we can't use local heartbeat because we plan to expand the cluster to have many OCFS2 mount points across the servers.

In the past, we had scaling issues with local heartbeat. Global heartbeat is definitely the way to go, if we can make it work!

Thank you for helping us to troubleshoot the problem.

Locked on Dec 10 2016
Added on Nov 10 2016