We are setting up a two-node cluster with the latest Oracle Linux 7.2 and the latest OCFS2.
Kernel installed: 3.8.13-118.14.1.el7uek.x86_64
Tools installed: ocfs2-tools.x86_64 1.8.6-7.el7
cluster.conf looks like this on both nodes:
node:
	number = 0
	cluster = ocfs2
	ip_port = 7777
	ip_address = 10.1.1.101
	name = ciosds01

node:
	number = 1
	cluster = ocfs2
	ip_port = 7777
	ip_address = 10.1.1.102
	name = ciosds02

cluster:
	heartbeat_mode = global
	node_count = 2
	name = ocfs2

heartbeat:
	cluster = ocfs2
	region = C39417265C094AA39EE6DA622B748BEA
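
To compare the two nodes quickly, a small helper can pull the heartbeat region(s) out of cluster.conf. This is only a sketch: list_hb_regions is a hypothetical helper name, and it assumes the usual layout where stanza headers end with a colon and parameters are written as "key = value". /etc/ocfs2/cluster.conf is the usual location; adjust the path if yours differs.

```shell
# Hypothetical helper: print the region of every "heartbeat:" stanza
# in an o2cb cluster.conf file given as $1.
list_hb_regions() {
    awk '
        # A stanza header starts in column 0 and ends with a colon.
        /^[^ \t#].*:[ \t]*$/ { stanza = $1; next }
        # Inside a heartbeat stanza, print the value of "region = <value>".
        stanza == "heartbeat:" && $1 == "region" && $2 == "=" { print $3 }
    ' "$1"
}
```

Running list_hb_regions /etc/ocfs2/cluster.conf on each node should print the same region on both.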
Here is the command we used to prepare the OCFS2 LUN:
mkfs.ocfs2 -F -b 4K -C 64K -N 4 -L ocfs2vol1 --cluster-name=ocfs2 --cluster-stack=o2cb --global-heartbeat /dev/scinid
mounted.ocfs2 -d
Device       Stack  Cluster  F  UUID                              Label
/dev/scinid  o2cb   ocfs2    G  C39417265C094AA39EE6DA622B748BEA  ocfs2vol1
blkid
/dev/scinid: LABEL="ocfs2vol1" UUID="c3941726-5c09-4aa3-9ee6-da622b748bea" TYPE="ocfs2"
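
blkid prints the UUID in lowercase with hyphens, while cluster.conf and mounted.ocfs2 show it in uppercase without hyphens, so the two are easy to misread as different. A minimal sketch for comparing them (uuid_to_region is a hypothetical helper name, not part of ocfs2-tools):

```shell
# Hypothetical helper: convert a blkid-style UUID (lowercase, hyphenated)
# to the uppercase, hyphen-free form used for the heartbeat region.
uuid_to_region() {
    printf '%s\n' "$1" | tr -d '-' | tr '[:lower:]' '[:upper:]'
}
```

With the values above, uuid_to_region c3941726-5c09-4aa3-9ee6-da622b748bea gives C39417265C094AA39EE6DA622B748BEA, which is the region in our cluster.conf.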
When we run the O2CB configure step, the global heartbeat fails to start:
/sbin/o2cb.init configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [y]:
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
checking debugfs...
Setting cluster stack "o2cb": OK
Registering O2CB cluster "ocfs2": OK
Setting O2CB cluster timeouts : OK
Starting global heartbeat for cluster "ocfs2": Failed
o2cb: Heartbeat region could not be found C39417265C094AA39EE6DA622B748BEA
Stopping global heartbeat on cluster "ocfs2": OK
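
The error names the region only by UUID. One way to confirm which device carries that region on a given node is to filter the mounted.ocfs2 -d output. This is a sketch under assumptions: device_for_region is a hypothetical helper name, and it relies on the six-column layout shown earlier (Stack and Cluster populated), so it may misread rows where those columns are empty.

```shell
# Hypothetical helper: read `mounted.ocfs2 -d` output on stdin and print
# the device(s) whose UUID column (5th field) equals the region in $1.
# The first line is the column header and is skipped.
device_for_region() {
    awk -v region="$1" 'NR > 1 && $5 == region { print $1 }'
}
```

Usage on each node: mounted.ocfs2 -d | device_for_region C39417265C094AA39EE6DA622B748BEA. An empty result would mean the scan on that node does not list a device with that UUID.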
SELinux is disabled on both nodes.
We tried the same steps from both nodes in the cluster, with the same failure each time.
If we switch to local heartbeat, the cluster comes online without problems.
Unfortunately we can't use local heartbeat, because we plan to expand the cluster to have many OCFS2 mount points across the servers.
In the past we had scaling issues with local heartbeat, so global heartbeat is definitely the way to go, if we can make it work.
Thank you for helping us to troubleshoot the problem.