11gR2 post addNode problems
Alecsel, Jan 24 2012 (edited Feb 3 2012)
Hi,
I'm trying to add a third node to my existing two-node cluster. I succeeded in adding the node, but I have the following problem:
- When I start nodes 1 and 2, everything is OK, but when node 3 starts up, one of the other nodes fails.
- This happens in any combination: start 1 and 3, everything is OK; when I start node 2, node 3 fails :)
The ASM configuration (disks, permissions, etc.) is the same on all three nodes.
Please find below the ASM and crsd log details from the time of the failure:
alert_ASM.log:
NOTE: enlarging ACD for group 1/0x124851d1 (DATA)
WARNNING: cache read a corrupted block group=DATA dsk=4 blk=1 from disk 4
NOTE: a corrupted block from group DATA was dumped to /u01/app/oracle/diag/asm/asm/+ASM3/trace/+ASM3_rbal_4109.trc
WARNNING: cache read(retry) a corrupted block group=DATA dsk=4 blk=1 from disk 4
ERROR: cache failed to read group=DATA dsk=4 blk=1 from disk(s): 4 DATA_0004
ORA-15196: invalid ASM block header [kfc.c:23924] [endian_kfbh] [2147483652] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23908] [endian_kfbh] [2147483652] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23924] [endian_kfbh] [2147483652] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:23908] [endian_kfbh] [2147483652] [1] [0 != 1]
System State dumped to trace file /u01/app/oracle/diag/asm/asm/+ASM3/trace/+ASM3_rbal_4109.trc
NOTE: AMDU dump of disk group DATA created at /u01/app/oracle/diag/asm/asm/+ASM3/trace
NOTE: cache initiating offline of disk 4 group DATA
NOTE: process 4109 initiating offline of disk 4.3915948326 (DATA_0004) with mask 0x7e in group 1
WARNING: Disk DATA_0004 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 4/0xe968a126, mode = 0x15
kfdp_updateDsk(): 4
Tue Jan 24 12:56:18 2012
kfdp_updateDskBg(): 4
ERROR: too many offline disks in PST (grp 1)
WARNING: Disk DATA_0004 in mode 0x7f offline aborted
Tue Jan 24 12:56:18 2012
NOTE: halting all I/Os to diskgroup DATA
NOTE: active pin 0x0x6d0748a0 found in RBAL
NOTE: active pin 0x0x6d0749b0 found in RBAL
NOTE: active pin 0x0x6d074680 found in RBAL
NOTE: active pin 0x0x6d074790 found in RBAL
NOTE: active pin 0x0x6d074ac0 found in RBAL
ERROR: ACD not enlarged for diskgroup 1/0x124851d1 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 1
Tue Jan 24 12:56:18 2012
SQL> alter diskgroup DATA dismount force /* ASM SERVER */
Errors in file /u01/app/oracle/diag/asm/asm/+ASM3/trace/+ASM3_rbal_4109.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0004" may result in a data loss
Errors in file /u01/app/oracle/diag/asm/asm/+ASM3/trace/+ASM3_rbal_4109.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0004" may result in a data loss
NOTE: cache dismounting (not clean) group 1/0x124851D1 (DATA)
Tue Jan 24 12:56:19 2012
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=15.538 last written ABA 15.538
kjbdomdet send to inst 1
detach from dom 1, sending detach message to inst 1
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
List of instances:
 1 2 3
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 10)
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
 1 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Jan 24 12:56:19 2012
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0x124851D1 (DATA)
kfdp_dismount(): 5
kfdp_dismountBg(): 5
NOTE: De-assigning number (1,0) from disk (/dev/oracleasm/disks/DISK1)
NOTE: De-assigning number (1,1) from disk (/dev/oracleasm/disks/DISK2)
NOTE: De-assigning number (1,2) from disk (/dev/oracleasm/disks/DISK3)
NOTE: De-assigning number (1,3) from disk (/dev/oracleasm/disks/DISK4)
NOTE: De-assigning number (1,4) from disk (/dev/oracleasm/disks/DISK5)
SUCCESS: diskgroup DATA was dismounted
NOTE: cache deleting context for group DATA 1/306729425
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
Tue Jan 24 12:56:19 2012
NOTE: diskgroup resource ora.DATA.dg is offline
Tue Jan 24 12:56:19 2012
Errors in file /u01/app/oracle/diag/asm/asm/+ASM3/trace/+ASM3_ora_5094.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
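To make the failure sequence in the ASM log above easier to scan, a short script could tally the ORA- codes and the corrupt block reads it reports. This is only a minimal sketch: the function name `summarize_alert_log` and both regular expressions are assumptions based on the message shapes in this excerpt, not an Oracle tool or a documented log format.

```python
import re
from collections import Counter

# Patterns assumed from the message shapes in the excerpt above.
ORA_CODE = re.compile(r"\bORA-(\d{5})\b")
CORRUPT_READ = re.compile(
    r"cache read(?:\(retry\))? a corrupted block group=(\S+) dsk=(\d+) blk=(\d+)"
)


def summarize_alert_log(text):
    """Return (ORA code counts, set of (group, disk, block) corrupt reads)."""
    ora_counts = Counter()
    corrupt_blocks = set()
    for line in text.splitlines():
        # Count every ORA-nnnnn code that appears on the line.
        for code in ORA_CODE.findall(line):
            ora_counts["ORA-" + code] += 1
        # Record each distinct block that ASM reported as corrupt.
        match = CORRUPT_READ.search(line)
        if match:
            group, disk, block = match.groups()
            corrupt_blocks.add((group, int(disk), int(block)))
    return ora_counts, corrupt_blocks


if __name__ == "__main__":
    # A few lines copied from the log above, used as sample input.
    sample = """\
WARNNING: cache read a corrupted block group=DATA dsk=4 blk=1 from disk 4
WARNNING: cache read(retry) a corrupted block group=DATA dsk=4 blk=1 from disk 4
ORA-15196: invalid ASM block header [kfc.c:23924] [endian_kfbh] [2147483652] [1] [0 != 1]
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0004" may result in a data loss
"""
    counts, blocks = summarize_alert_log(sample)
    for code, n in sorted(counts.items()):
        print(code, n)
    print(sorted(blocks))
```

Run against the full alert log, this would show at a glance that every corrupt read points at the same block (group DATA, disk 4, block 1) before the forced dismount.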
crsd.log:
[UiServer][1433831744] Container [ Name: UI_STOP
ASYNC_TAG:
TextMessage[1]
CLIENT:
TextMessage[]
CLIENT_PRIMARY_GROUP:
TextMessage[oinstall]
EVENT_TAG:
TextMessage[1]
FILTER:
TextMessage[((NAME==ora.DATA.dg)&&(LAST_SERVER==rac3))USR_ORA_OPI=true]
FILTER_TAG:
TextMessage[1]
FORCE_TAG:
TextMessage[1]
LOCALE:
TextMessage[AMERICAN_AMERICA.AL32UTF8]
NO_WAIT_TAG:
TextMessage[1]
QUEUE_TAG:
TextMessage[1]
]