VIP failover problem on RAC 11gR1
Yador · Sep 26 2011 (edited Dec 12 2011)

Hi,
I've installed a RAC 11gR1, patched 11.1.0.7, on RHEL on virtual machines, 2 nodes.
The cluster is running in a separate subnet.
I'm doing some HA tests, and during one of them I ran into a problematic situation.
The scenario:
I simulate the loss of the interconnect (easy to do, since I run on VMs). The reaction of the cluster seems to be OK: one node (let's say node 2) is evicted and reboots.
The VIP of the evicted node (2) is now "running" on the surviving node (checked with crs_stat -t).
A new interface alias eth0:2 with the virtual IP address of node 2 is created on node 1 (checked with ifconfig -a).
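For reference, these are the kinds of checks I mean (a sketch; the exact VIP resource names depend on your installation):

```shell
# On the surviving node (node 1), after the eviction:

# Clusterware view: node 2's VIP resource should show as ONLINE on node 1
crs_stat -t | grep -i vip

# OS view: the failed-over VIP appears as an extra alias on the public NIC
ifconfig -a
```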
All seems OK.
When the evicted node comes back up, it can't rejoin the cluster because the interconnect is still down.
The problem I see is that the virtual IP address of node 2 is now configured on eth0:1 on node 2... and at the same time on eth0:2 on node 1! That doesn't look like a normal situation, does it?
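To spot this duplicate, I compare captured ifconfig output from both nodes with a small helper like this (just a sketch; the VIP 192.168.1.102 is an example address, not my real one):

```shell
# Hypothetical helper: given ifconfig-style output on stdin, print the
# interface aliases that currently carry a given IP address.
find_vip_alias() {
  # $1 = IP address to look for
  awk -v ip="$1" '
    /^[a-z]/ { iface = $1 }                 # remember current interface name
    $0 ~ "inet addr:" ip { print iface }    # report the interface holding it
  '
}

# Example with a captured fragment of ifconfig output:
printf 'eth0:1    Link encap:Ethernet\n          inet addr:192.168.1.102  Bcast:192.168.1.255\n' \
  | find_vip_alias 192.168.1.102
# prints: eth0:1
```

Running it against the output of both nodes shows the same VIP on eth0:1 (node 2) and eth0:2 (node 1) simultaneously.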
Finally, I reconnect the interconnect and start the cluster stack on node 2. Node 2 joins the cluster normally, and VIP2 is now running on node 2.
BUT... I can no longer ping VIP2 from outside the subnet... as if no gratuitous ARP was sent to inform the gateway that the MAC address behind VIP2 has changed.
The only workarounds I've found are to run ifdown/ifup on eth0:1 on node 2, or to clear the ARP table on the gateway.
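Concretely, the manual workarounds look like this (a sketch only; these need root, and the VIP address 192.168.1.102 and interface names are assumptions for illustration):

```shell
# Workaround 1: bounce the alias on node 2 so the VIP is re-announced
ifdown eth0:1 && ifup eth0:1

# Workaround 2: send a gratuitous ARP for the VIP yourself, so neighbours
# (including the gateway) refresh their ARP caches
# (iputils arping; -U = unsolicited/gratuitous ARP)
arping -U -I eth0 -c 3 192.168.1.102

# Workaround 3: on the gateway itself, delete the stale ARP entry
arp -d 192.168.1.102
```

Of course, none of these should be necessary if the VIP failback sent the gratuitous ARP itself.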
Could someone tell me if I've missed something in my configuration, or if I've misunderstood the concept of VIP failover?
Thank you for your help.
Yann