T3/SANbox-8/Qlogic 2340/V440 hangs on boot
807557, Aug 7 2007 (edited Aug 8 2007)
We're working to install a cluster using 2 V440s with 2 QL2340 (SG-XPCI1FC-QL2) HBAs, 2 Sun SANBox 8 port switches, and a T3 partner pair.
We're experiencing system hangs at "Configuring devices" at boot, and when allowed to boot with the HBAs unplugged, the HBAs are visible to cfgadm -al (unconfigured), but as soon as they are plugged in, they disappear. The same behavior occurs with fcinfo.
At the ok prompt, a probe-scsi-all reveals both HBAs see all four LUNs on the T3 partner pair, but I can't boot the server with the HBAs plugged in.
To supply enough information for you to work with, here's what I've done so far, so tell me where I screwed it up, please.
I installed the HBAs, loaded Solaris 10 11/06, patched up to 125100-10, and plugged up to the switches. I set the switches to factory defaults using SANbox Manager, set the host ports to F, and the T3 ports to TL (not initiator) as shown in all the config manuals I could find. I also set one switch's domain ID to 17 so the two weren't the same.
I configured the T3s with a RAID5 volume and a RAID1 volume in each, all volumes sliced and mounted, and I verified they were correct by directly connecting from an HBA to the T3 master and saw all LUNs as /scsi_vhci/xxxx volumes.
This would lead me to believe that the HBAs and T3s are up and running fine, but when I plug into the switches, the hangs occur and I can't see the HBAs anymore. I enabled extended-logging in qlc.conf, and can see errors in /var/adm/messages. I have to wait for the machine to come back up to reference them, but if someone can help, I'll transcribe them here (machine's on a different network, so I have to hand-jam them in here).
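For anyone following along, the extended logging mentioned above is a one-line driver setting. A minimal sketch, assuming the stock qlc configuration file at /kernel/drv/qlc.conf and that a value of 1 enables it (both are assumptions about a default Solaris 10 install, not copied from this machine):

```
# /kernel/drv/qlc.conf (assumed default location)
# 0 = extended logging off, 1 = on (assumed values)
extended-logging=1;
```

The change only shows up in /var/adm/messages once the driver has re-read its configuration, for example after a reboot.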
Fixes attempted so far:
I applied patches 114873-05, 116930-07, 125166-05, and upgraded the firmware on the switches. I swapped the FCode on the SUNW cards for QLGC firmware and applied the QLA2300 driver; that works straight to the T3, but has no MPxIO support, so I went back to SUNW 1.16 FCode and the qlc driver.
I've changed out fiber between host and switch and swapped GBICs. I've installed different versions of SUNWsmgr2, but can't find SUNWsmgr anywhere for download (which may be a problem, as most manuals for the Switch-8 use it, and have windows and buttons SUNWsmgr2 doesn't).
I've RTFM (all the FMs; SANbox, T3, SAN, multipathing) about a half dozen times in the last week and haven't gotten anywhere, so any help is appreciated.
S/F
Jeff
ETA: I just plugged up the old 280R running Sol8 to the switches, booted single user, and could see all LUNs on the T3s. Switch ports are identical, so the TL <-> F port configuration appears to be good.
So what is it about these HBAs / Sol10 that I'm missing? Is Sol8 smarter than Sol10, or is Sol10 just smarter than me???