Hi All,
I have a two-node cluster on Oracle VM 3.2.9.746.
On one of the nodes, when I try to reboot the VMs, they stay stuck in the "STARTING" state.
In the logs I can see the following:
[2015-04-27 14:05:57 15984] DEBUG (common:43) dispatch function stop_vm to server https://oracle:******@172.28.20.184:8899/api/3
[2015-04-27 14:05:57 15985] DEBUG (service:74) call start: stop_vm('', '0004fb0000060000b0be178e3f573f68', True)
[2015-04-27 14:08:03 16101] ERROR (ha:34) Failed to get VM list on xxx.xx.xx.xxx: The read operation timed out
[2015-04-27 14:08:03 16104] ERROR (service:96) catch_error: Lock file /var/run/ovs-agent/vm-list.lock failed: timeout occured.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/agent/lib/service.py", line 94, in wrapper
    return func(*args)
  File "/usr/lib64/python2.4/site-packages/agent/api/hypervisor/xenxm.py", line 246, in list_vms
    return get_vms()
  File "/usr/lib64/python2.4/site-packages/agent/lib/xenxm.py", line 127, in get_vms
    lock.acquire(wait=120)
  File "/usr/lib64/python2.4/site-packages/agent/lib/filelock.py", line 90, in acquire
    raise LockError("Lock file %s failed: timeout occured." % self.filename)
LockError: Lock file /var/run/ovs-agent/vm-list.lock failed: timeout occured.
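To make sure I am reading the traceback correctly, here is a minimal sketch of how I understand a lock with a timeout to behave. I have not read the agent's filelock.py, so the flock-based polling and every name except LockError and the wait parameter are my own assumptions, and it is written for Python 3 rather than the agent's Python 2.4.

```python
import errno
import fcntl
import os
import time


class LockError(Exception):
    """Raised when the lock cannot be acquired before the deadline."""


def acquire(path, wait=120, poll=0.1):
    """Try to take an exclusive flock on path, giving up after `wait` seconds.

    Returns the open file descriptor that holds the lock.
    Assumption: the real agent may use a different locking mechanism.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    deadline = time.monotonic() + wait
    while True:
        try:
            # Non-blocking attempt; fails immediately if someone else holds it.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except OSError as exc:
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                os.close(fd)
                raise
        if time.monotonic() >= deadline:
            os.close(fd)
            raise LockError("Lock file %s failed: timeout." % path)
        time.sleep(poll)
```

If this is roughly what happens, the error after 120 seconds would mean that some other process was still holding vm-list.lock the whole time, which would fit with the VM listing itself being stuck.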
If I run xm list on the Oracle VM node, it freezes and I never get any output from the command.
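So that the hang does not block my shell, I have been thinking of probing the command with a timeout. This is only a sketch: the helper name probe and the 30-second default are my own choices, and it assumes a Python 3.5+ interpreter, which the stock node does not have (the traceback paths show Python 2.4).

```python
import subprocess


def probe(cmd, timeout=30):
    """Run cmd and return its combined output as text.

    Returns None if the command did not finish within `timeout` seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return None
    return result.stdout.decode(errors="replace")
```

Calling probe(["xm", "list"]) on the node would then return None when the command hangs. One caveat I am aware of: if the process is stuck in an uninterruptible state, killing it may not succeed and the probe itself could still block.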
This only happens from time to time, not on every reboot.
Can anyone give me some direction on where to look to resolve this issue? Any valid input will be appreciated.
Best Regards