How to reboot a guest domain when hung and ldm stop-domain doesn't work
807567, May 11 2007 (edited Sep 4 2007)
Hi, the configuration is as follows.
Sun Fire T1000 (32 threads, 16 GB memory)
Latest Firmware and the LDOM patch (-02) applied.
This is how the LDOMs are set up.
Instance        CPUs  Memory
Service domain  4     2g
ldom1           4     2g
ldom2           4     2g
ldom3           4     2g
ldom4           4     2g
ldom5           4     2g
ldom6           4     2g
ldom7           4     1.9g
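As a quick sanity check on the allocation above, the two numeric columns can be summed with awk. This is only a sketch: `total_alloc` is a hypothetical helper name (not part of the ldm CLI), and it assumes every row ends with a CPU count followed by a memory size in "g".

```shell
# total_alloc: hypothetical helper, not an ldm subcommand.
# Reads allocation rows on stdin and prints "<total CPUs> <total memory in g>".
total_alloc() {
  # The last field is the memory size (awk's "+ 0" keeps the leading number
  # and drops the trailing "g"); the field before it is the CPU count.
  awk '{ cpus += $(NF-1); mem += $NF + 0 }
       END { printf "%d %.1f\n", cpus, mem }'
}

total_alloc <<'EOF'
Service domain 4 2g
ldom1 4 2g
ldom2 4 2g
ldom3 4 2g
ldom4 4 2g
ldom5 4 2g
ldom6 4 2g
ldom7 4 1.9g
EOF
```

With the rows above this prints `32 15.9`, so all 32 threads and 15.9 of the 16 GB are handed out to domains.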
All guest domains run from disk images stored on a mirrored BE on the service domain. Each image is around 7 GB, with SUNWCXall installed.
However, I have had a few hangs, especially when working over the virtual switch on the domains.
At the moment ldom1 is totally hung. See below for info:
bash-3.00# ldm list-domain
Name     State   Flags  Cons  VCPU  Memory  Util  Uptime
primary  active  -t-cv  SP    4     2G      0.5%  1d 1h 17m
ldom1    active  -t---  5000  4     2G      25%   2h 14m
ldom2    active  -t---  5001  4     2G      0.2%  2h 35m
ldom3    active  -----  5002  4     2G      0.2%  47m
ldom4    active  -----  5003  4     2G      0.2%  1d 1h 10m
ldom5    active  -t---  5004  4     2G      0.3%  1d 1h 10m
ldom6    active  -t---  5005  4     2G      0.2%  1d 1h 10m
ldom7    active  -t---  5006  4     1900M   0.2%  7h 29m
bash-3.00#
bash-3.00# ldm stop-domain ldom1
LDom ldom1 stop notification failed
bash-3.00#
bash-3.00# telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connecting to console "ldom1" in group "ldom1" ....
Press ~? for control options ..
<COMMENT: ~w sent!>
Warning: another user currently has write permission
to this console and forcibly removing him/her will terminate
any current write action and all work will be lost.
Would you like to continue?[y/n] y
<COMMENT: I get no response when pressing Enter, and ~# (break) doesn't seem to work.>
I cannot ssh to ldom1 since it appears to be dead!
Anyone know if I can send some sort of reset to this hung domain? How can I troubleshoot it?
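For reference, one escalation that may be worth trying is a forced stop. The sketch below assumes the installed ldm release accepts the `-f` flag on `stop-domain`. `stop_or_force` is a hypothetical wrapper name, not an ldm subcommand, and whether a forced stop actually clears this particular hang is not confirmed.

```shell
# stop_or_force: hypothetical wrapper, not part of the ldm CLI.
# Tries a normal stop first and falls back to a forced stop.
# Assumption: this ldm release supports "stop-domain -f".
stop_or_force() {
  dom=$1
  if ldm stop-domain "$dom"; then
    echo "graceful"
  elif ldm stop-domain -f "$dom"; then
    echo "forced"
  else
    echo "failed"
    return 1
  fi
}
```

It would be run from the service domain as `stop_or_force ldom1`, followed by `ldm list-domain` to check whether the domain has left the active state.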
Regards,
Daniel