Here's the story: I have some LUNs in a Sun Cluster that are presented to two physical global zones (running Solaris 11.3). These are then allocated to a zone, which is a cluster resource, in the standard way through zone config file entries such as:
<device match="/dev/did/rdsk/d81s4"/>
These are then used by an ASM storage system in the zone.
Now I've added new LUNs and want to remove the old ones. So I remove them from ASM, then remove them from the zone config file and run zoneadm apply, which in theory should remove them from the zone:
# zoneadm -z dbserver1 apply
zone 'dbserver1': Checking: Removing device match=/dev/did/rdsk/d74s4
zone 'dbserver1': Checking: Removing device match=/dev/did/rdsk/d75s4
zone 'dbserver1': Checking: Removing device match=/dev/did/rdsk/d76s4
zone 'dbserver1': Checking: Removing device match=/dev/did/rdsk/d77s4
zone 'dbserver1': Applying the changes
I noticed that the zone still had the device files. I've seen this before: you just have to continue with unpresenting the storage from the host to eventually get rid of the device files. So that's what I do. I remove any SCSI locks on the LUNs with these commands:
a=/dev/rdsk/c0t60060E8006D02A000000D02A000000E8d0
/usr/cluster/lib/sc/scsi -c inkeys -d $a
/usr/cluster/lib/sc/scsi -c disfailfast -d $a
/usr/cluster/lib/sc/scsi -c release -d $a
/usr/cluster/lib/sc/scsi -c scrub -d $a
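Since there are several LUNs, the same four steps can be wrapped in a loop. This is only a rough sketch of what I run: the function name clear_scsi_keys and the SCSI override variable are my own shorthand for illustration, not anything shipped with the cluster software.

```shell
# Sketch only: run the same four scsi subcommands shown above against
# each raw device path passed as an argument.
# SCSI defaults to the cluster binary; the override is my own addition so
# the loop can be dry-run with a stand-in command.
SCSI=${SCSI:-/usr/cluster/lib/sc/scsi}

clear_scsi_keys() {
    for dev in "$@"; do
        for cmd in inkeys disfailfast release scrub; do
            # Report a failing step but carry on with the remaining ones.
            "$SCSI" -c "$cmd" -d "$dev" || echo "warning: $cmd failed on $dev" >&2
        done
    done
}
```

It is called with one or more device paths, for example: clear_scsi_keys /dev/rdsk/c0t60060E8006D02A000000D02A000000E8d0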
I then unpresent the storage from the SAN array and run these commands:
cfgadm -al -o show_SCSI_LUN | grep unusable
cfgadm -o unusable_SCSI_LUN -c unconfigure c2::50060e8005456500 #etc
devfsadm -v -C -c disk
cldevice clear
Now, on the other cluster node this worked perfectly, and devfsadm cleared all the device files. But on the cluster node where the zone is running, it did not clear the device files. Furthermore, I got these errors in the messages log:
Nov 8 16:44:48 cbneuh05a Cluster.CCR: [ID 357511 daemon.warning] reservation warning(node_join) - MHIOCGRP_INKEYS error(5) will retry in 2 seconds
Nov 8 16:44:52 cbneuh05a last message repeated 11 times
Nov 8 16:44:54 cbneuh05a Cluster.CCR: [ID 826747 daemon.warning] reservation error(node_join) - do_scsi3_inkeys() error for disk /dev/did/rdsk/d77s2
Nov 8 16:44:54 cbneuh05a Cluster.CCR: [ID 826747 daemon.warning] reservation error(node_join) - do_scsi3_inkeys() error for disk /dev/did/rdsk/d75s2
Nov 8 16:44:54 cbneuh05a Cluster.CCR: [ID 826747 daemon.warning] reservation error(node_join) - do_scsi3_inkeys() error for disk /dev/did/rdsk/d76s2
Nov 8 16:44:54 cbneuh05a Cluster.CCR: [ID 826747 daemon.warning] reservation error(node_join) - do_scsi3_inkeys() error for disk /dev/did/rdsk/d74s2
Nov 8 16:46:04 cbneuh05a rcm_daemon[16126]: [ID 835212 daemon.error] IP: get_link_resource for clprivnet0 error(object not found)
Nov 8 16:46:04 cbneuh05a rcm_daemon[16126]: [ID 530554 daemon.error] IP: get_link_resource(clprivnet0) failed
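To make it easier to see which DID devices the reservation errors refer to, here is a small filter I'd use on the log. It is a sketch that assumes the log lines look exactly like the ones above; the function name failed_did_devices is just something I made up for this post.

```shell
# Sketch: list the unique DID devices named in do_scsi3_inkeys() errors.
# Reads syslog text on stdin and prints one device path per line.
failed_did_devices() {
    grep 'do_scsi3_inkeys() error for disk' |
        sed 's/.*error for disk //' |   # keep only the device path
        sort -u                         # drop repeats
}
```

It would be run as failed_did_devices < /var/adm/messages, and for the messages above it lists d74s2 through d77s2, which are the same four DID devices I removed from the zone.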
So what am I doing wrong here? What do I need to do to safely remove these LUNs without leaving old device files sitting around? I'd much appreciate any help or suggestions. Thanks.
-Andrew