I am running Oracle 12.2.0.1.0 with ASM on a single instance (not RAC).
Frequently, after a system reboot, I get the following errors when trying to mount the disk groups:
ASMCMD> mount -a
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "15" is missing from group number "1"
ORA-15042: ASM disk "14" is missing from group number "1"
ORA-15042: ASM disk "13" is missing from group number "1"
ORA-15042: ASM disk "12" is missing from group number "1"
ORA-15042: ASM disk "11" is missing from group number "1"
ORA-15042: ASM disk "9" is missing from group number "1"
ORA-15042: ASM disk "7" is missing from group number "1"
ORA-15042: ASM disk "4" is missing from group number "1"
ORA-15017: diskgroup "REDO" cannot be mounted
ORA-15040: diskgroup is incomplete (DBD ERROR: OCIStmtExecute)
This seems to happen to all of the disk groups on the system. There are two: DATA and REDO.
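To keep track of which disks are reported missing across repeated occurrences, the ORA-15042 lines can be summarized per group number. The following is only a minimal sketch: the function name `missing_disks` is made up for illustration, and it assumes the error text has been pasted or captured as a plain string in exactly the format shown above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ORA-15042: ASM disk "15" is missing from group number "1"
MISSING_DISK = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"'
)

def missing_disks(output):
    """Return {group_number: sorted list of missing disk numbers}.

    Hypothetical helper; it only parses text and does not talk to ASM.
    """
    groups = defaultdict(set)
    for line in output.splitlines():
        match = MISSING_DISK.search(line)
        if match:
            disk, group = int(match.group(1)), int(match.group(2))
            groups[group].add(disk)
    return {group: sorted(disks) for group, disks in groups.items()}

# The error output from the mount attempt above, copied as-is.
sample = '''ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "15" is missing from group number "1"
ORA-15042: ASM disk "14" is missing from group number "1"
ORA-15042: ASM disk "13" is missing from group number "1"
ORA-15042: ASM disk "12" is missing from group number "1"
ORA-15042: ASM disk "11" is missing from group number "1"
ORA-15042: ASM disk "9" is missing from group number "1"
ORA-15042: ASM disk "7" is missing from group number "1"
ORA-15042: ASM disk "4" is missing from group number "1"
ORA-15017: diskgroup "REDO" cannot be mounted
ORA-15040: diskgroup is incomplete (DBD ERROR: OCIStmtExecute)'''

for group, disks in missing_disks(sample).items():
    print(f"group {group}: {len(disks)} disks missing: {disks}")
```

Comparing this summary between incidents would show whether the same disk numbers go missing each time or a different set.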
I've looked at the disk labels with kfed, comparing the label of a disk reported as good (number 1) with that of a disk reported as bad (number 15). I saved the output for each disk to a file, and diffing the two files gave this:
[oracle@aps41-20 tmp]$ diff data01 dta08
6,7c6,7
< kfbh.block.obj: 2147483648 ; 0x008: disk=0
< kfbh.check: 3287687001 ; 0x00c: 0xc3f61f59
---
> kfbh.block.obj: 2147483663 ; 0x008: disk=15
> kfbh.check: 3287686995 ; 0x00c: 0xc3f61f53
20c20
< kfdhdb.dsknum: 0 ; 0x024: 0x0000
---
> kfdhdb.dsknum: 15 ; 0x024: 0x000f
23c23
< kfdhdb.dskname: DATA_0000 ; 0x028: length=9
---
> kfdhdb.dskname: DATA_0015 ; 0x028: length=9
25c25
< kfdhdb.fgname: DATA_0000 ; 0x068: length=9
---
> kfdhdb.fgname: DATA_0015 ; 0x068: length=9
70c70
< kfdhdb.f1b1locn: 10 ; 0x0d4: 0x0000000a
---
> kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
120c120
< kfdhdb.ub4spare[16]: 0 ; 0x198: 0x00000000
---
> kfdhdb.ub4spare[16]: 470031366 ; 0x198: 0x1c041c06
The label looks like it could be OK. I will also note that a kfed repair fails on both disk number 1 and disk number 15 with this error:
KFED-00320: invalid block num1 = [3], num2 = [1], error = [type_kfbh]
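The comparison of the two kfed dumps above can also be done field by field, which makes it easier to separate differences that are expected between any two disks from the ones that might deserve a closer look. This is a rough sketch under stated assumptions: the names `parse_kfed`, `unexpected_differences` and `EXPECTED_PER_DISK` are invented for illustration, the set of "expected" fields is my own guess rather than anything from Oracle documentation, and the sample text contains only the lines that appear in the diff.

```python
import re

# One kfed field per line, e.g. "kfdhdb.dsknum: 15 ; 0x024: 0x000f"
FIELD = re.compile(r'^\s*([^:\s]+):\s*(.*?)\s*;\s*0x[0-9a-fA-F]+:')

# Assumption: these fields normally differ between two disks of one group
# (disk number, disk name, failure group name, block owner, checksum).
EXPECTED_PER_DISK = {
    "kfbh.block.obj",
    "kfbh.check",
    "kfdhdb.dsknum",
    "kfdhdb.dskname",
    "kfdhdb.fgname",
}

def parse_kfed(text):
    """Parse kfed output into {field_name: value_as_string}."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields

def unexpected_differences(good_text, bad_text):
    """Return differing fields that are not in EXPECTED_PER_DISK."""
    good, bad = parse_kfed(good_text), parse_kfed(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name) and name not in EXPECTED_PER_DISK
    }

# Only the lines that differed in the diff above, one side per string.
good = '''kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3287687001 ; 0x00c: 0xc3f61f59
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.dskname: DATA_0000 ; 0x028: length=9
kfdhdb.fgname: DATA_0000 ; 0x068: length=9
kfdhdb.f1b1locn: 10 ; 0x0d4: 0x0000000a
kfdhdb.ub4spare[16]: 0 ; 0x198: 0x00000000'''

bad = '''kfbh.block.obj: 2147483663 ; 0x008: disk=15
kfbh.check: 3287686995 ; 0x00c: 0xc3f61f53
kfdhdb.dsknum: 15 ; 0x024: 0x000f
kfdhdb.dskname: DATA_0015 ; 0x028: length=9
kfdhdb.fgname: DATA_0015 ; 0x068: length=9
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
kfdhdb.ub4spare[16]: 470031366 ; 0x198: 0x1c041c06'''

for name, (good_value, bad_value) in unexpected_differences(good, bad).items():
    print(f"{name}: good={good_value} bad={bad_value}")
```

A field showing up in this list does not by itself indicate corruption; it only narrows down which values to ask about.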
Also, when I run oracleasm listdisks as root, it returns nothing. I would have expected at least the good disks to be listed.
This issue has happened 4 times in the past (this is the 5th time). It has occurred almost every time I've rebooted the server, and at least once when I only restarted the database. In the past, I've recreated the database. But recreating the database takes a day, and since this has happened multiple times, I'd really like to figure out what is going wrong so that I can correct it.
I have also searched the Internet and this forum for this problem, and I have not seen anything that addresses this issue.
Thank you for any help or pointers you can provide.
Best Regards,
BobA