RAC and ASM issue
745540Mar 27 2011 — edited Mar 27 2011Hi All,
i am getting an error message in asm_alert log file saying "NOTE: ASMB process exiting due to lack of ASM file activity".
This leads to frequent crashing of NODE 1. Please check below detail error and suggest solution.
Thu Mar 24 07:05:11 2011
LMD0 (ospid: 32493) has not called a wait for 94 secs.
GES: System Load is HIGH.
GES: Current load is 55.87 and high load threshold is 20.00
Thu Mar 24 07:06:32 2011
LMD0 (ospid: 32493) has not called a wait for 174 secs.
GES: System Load is HIGH.
GES: Current load is 71.23 and high load threshold is 20.00
Thu Mar 24 07:06:36 2011
+Trace dumping is performing id=[cdmp_20110324070635]+
Thu Mar 24 07:07:49 2011
+Trace dumping is performing id=[cdmp_20110324070635]+
Thu Mar 24 07:08:16 2011
Waiting for clusterware split-brain resolution
Thu Mar 24 07:18:17 2011
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc (incident=60073):+
ORA-29740: evicted by member 1, group incarnation 120
Incident details in: /u01/app/oracle/diag/asm/asm/+ASM1/incident/incdir_60073/+ASM1_lmon_32484_i60073.trc+
Thu Mar 24 07:18:19 2011
+Trace dumping is performing id=[cdmp_20110324071819]+
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc:+
ORA-29740: evicted by member 1, group incarnation 120
LMON (ospid: 32484): terminating the instance due to error 29740
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_diag_32459.trc+
+Trace dumping is performing id=[cdmp_20110324071820]+
Instance terminated by LMON, pid = 32484
Thu Mar 24 07:18:31 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 172.20.223.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 172.20.222.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/asm_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/asm_1/dbs/initASM1.ora+
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "ORCL:*"
asm_diskgroups = "REDO01"
asm_diskgroups = "REDO02"
asm_diskgroups = "DATA"
asm_diskgroups = "RECOVERY"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
+172.20.223.25+
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Thu Mar 24 07:18:36 2011
PMON started with pid=2, OS id=23120
Thu Mar 24 07:18:36 2011
VKTM started with pid=3, OS id=23123 at elevated priority
VKTM running at (20)ms precision
Thu Mar 24 07:18:36 2011
DIAG started with pid=4, OS id=23127
Thu Mar 24 07:18:37 2011
PING started with pid=5, OS id=23129
Thu Mar 24 07:18:37 2011
PSP0 started with pid=6, OS id=23131
Thu Mar 24 07:18:37 2011
DIA0 started with pid=7, OS id=23133
Thu Mar 24 07:18:37 2011
LMON started with pid=8, OS id=23135
Thu Mar 24 07:18:37 2011
LMD0 started with pid=9, OS id=23137
Thu Mar 24 07:18:37 2011
LMS0 started with pid=10, OS id=23148 at elevated priority
Thu Mar 24 07:18:37 2011
MMAN started with pid=11, OS id=23152
Thu Mar 24 07:18:38 2011
DBW0 started with pid=12, OS id=23170
Thu Mar 24 07:18:38 2011
LGWR started with pid=13, OS id=23176
Thu Mar 24 07:18:38 2011
CKPT started with pid=14, OS id=23218
Thu Mar 24 07:18:38 2011
SMON started with pid=15, OS id=23224
Thu Mar 24 07:18:38 2011
RBAL started with pid=16, OS id=23237
Thu Mar 24 07:18:38 2011
GMON started with pid=17, OS id=23239
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 124)
ASM instance
List of nodes:
+0 1 2+
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* allocate domain 2, invalid = TRUE
* allocate domain 3, invalid = TRUE
* allocate domain 4, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
* domain 2 valid = 1 according to instance 1
* domain 3 valid = 1 according to instance 1
* domain 4 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Mar 24 07:18:40 2011
LCK0 started with pid=18, OS id=23277
ORACLE_BASE from environment = /u01/app/oracle
Thu Mar 24 07:18:41 2011
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0xf7063e39
NOTE: cache began mount (not first) of group DATA number=1 incarn=0xf7063e39
NOTE: cache registered group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache began mount (not first) of group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache registered group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache began mount (not first) of group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache registered group REDO02 number=4 incarn=0xf7163e3c
NOTE: cache began mount (not first) of group REDO02 number=4 incarn=0xf7163e3c
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,0) to disk (ORCL:ASM_DATA1)
NOTE: Assigning number (1,1) to disk (ORCL:ASM_DATA2)
NOTE: Assigning number (2,0) to disk (ORCL:ASM_RECO1)
NOTE: Assigning number (3,0) to disk (ORCL:ASM_LOG1)
NOTE: Assigning number (4,0) to disk (ORCL:ASM_LOG2)
kfdp_query(): 5
kfdp_queryBg(): 5
NOTE: cache opening disk 0 of grp 1: DATA1 label:ASM_DATA1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA2 label:ASM_DATA2
NOTE: cache mounting (not first) group 1/0xF7063E39 (DATA)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 1
NOTE: LGWR attempting to mount thread 1 for diskgroup 1
NOTE: LGWR mounted thread 1 for disk group 1
NOTE: opening chunk 1 at fcn 0.10794571 ABA
NOTE: seq=81 blk=1313
NOTE: cache mounting group 1/0xF7063E39 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0xf7063e39
kfdp_query(): 6
kfdp_queryBg(): 6
NOTE: cache opening disk 0 of grp 2: RECO1 label:ASM_RECO1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 2/0xF7063E3A (RECOVERY)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 2
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.10436377 ABA
NOTE: seq=48 blk=4298
NOTE: cache mounting group 2/0xF7063E3A (RECOVERY) succeeded
NOTE: cache ending mount (success) of group RECOVERY number=2 incarn=0xf7063e3a
kfdp_query(): 7
kfdp_queryBg(): 7
NOTE: cache opening disk 0 of grp 3: LOG1 label:ASM_LOG1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 3/0xF7163E3B (REDO01)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 3
NOTE: LGWR attempting to mount thread 1 for diskgroup 3
NOTE: LGWR mounted thread 1 for disk group 3
NOTE: opening chunk 1 at fcn 0.229332 ABA
NOTE: seq=30 blk=10690
NOTE: cache mounting group 3/0xF7163E3B (REDO01) succeeded
NOTE: cache ending mount (success) of group REDO01 number=3 incarn=0xf7163e3b
kfdp_query(): 8
kfdp_queryBg(): 8
NOTE: cache opening disk 0 of grp 4: LOG2 label:ASM_LOG2
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 4/0xF7163E3C (REDO02)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 4
NOTE: LGWR attempting to mount thread 1 for diskgroup 4
NOTE: LGWR mounted thread 1 for disk group 4
NOTE: opening chunk 1 at fcn 0.225880 ABA
NOTE: seq=30 blk=10556
NOTE: cache mounting group 4/0xF7163E3C (REDO02) succeeded
NOTE: cache ending mount (success) of group REDO02 number=4 incarn=0xf7163e3c
kfdp_query(): 9
kfdp_queryBg(): 9
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
kfdp_query(): 10
kfdp_queryBg(): 10
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup RECOVERY was mounted
kfdp_query(): 11
kfdp_queryBg(): 11
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup REDO01 was mounted
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 4
SUCCESS: diskgroup REDO02 was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Thu Mar 24 08:26:28 2011
Starting background process ASMB
Thu Mar 24 08:26:28 2011
ASMB started with pid=20, OS id=9597
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Thu Mar 24 08:27:39 2011
Starting background process ASMB
Thu Mar 24 08:27:39 2011
ASMB started with pid=25, OS id=10735
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
+[oracle@qa1crmrac1 trace]$ tail -1500 alert_ASM1.log
Thu Mar 24 07:05:11 2011
LMD0 (ospid: 32493) has not called a wait for 94 secs.
GES: System Load is HIGH.
GES: Current load is 55.87 and high load threshold is 20.00
Thu Mar 24 07:06:32 2011
LMD0 (ospid: 32493) has not called a wait for 174 secs.
GES: System Load is HIGH.
GES: Current load is 71.23 and high load threshold is 20.00
Thu Mar 24 07:06:36 2011
+Trace dumping is performing id=[cdmp_20110324070635]+
Thu Mar 24 07:07:49 2011
+Trace dumping is performing id=[cdmp_20110324070635]+
Thu Mar 24 07:08:16 2011
Waiting for clusterware split-brain resolution
Thu Mar 24 07:18:17 2011
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc (incident=60073):+
ORA-29740: evicted by member 1, group incarnation 120
Incident details in: /u01/app/oracle/diag/asm/asm/+ASM1/incident/incdir_60073/+ASM1_lmon_32484_i60073.trc+
Thu Mar 24 07:18:19 2011
+Trace dumping is performing id=[cdmp_20110324071819]+
Errors in file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_lmon_32484.trc:+
ORA-29740: evicted by member 1, group incarnation 120
LMON (ospid: 32484): terminating the instance due to error 29740
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/diag/asm/asm/+ASM1/trace/+ASM1_diag_32459.trc+
+Trace dumping is performing id=[cdmp_20110324071820]+
Instance terminated by LMON, pid = 32484
Thu Mar 24 07:18:31 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 172.20.223.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 172.20.222.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/asm_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in server-side pfile /u01/app/oracle/product/11.1.0/asm_1/dbs/initASM1.ora+
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
cluster_database = TRUE
instance_number = 1
asm_diskstring = "ORCL:*"
asm_diskgroups = "REDO01"
asm_diskgroups = "REDO02"
asm_diskgroups = "DATA"
asm_diskgroups = "RECOVERY"
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
+172.20.223.25+
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Thu Mar 24 07:18:36 2011
PMON started with pid=2, OS id=23120
Thu Mar 24 07:18:36 2011
VKTM started with pid=3, OS id=23123 at elevated priority
VKTM running at (20)ms precision
Thu Mar 24 07:18:36 2011
DIAG started with pid=4, OS id=23127
Thu Mar 24 07:18:37 2011
PING started with pid=5, OS id=23129
Thu Mar 24 07:18:37 2011
PSP0 started with pid=6, OS id=23131
Thu Mar 24 07:18:37 2011
DIA0 started with pid=7, OS id=23133
Thu Mar 24 07:18:37 2011
LMON started with pid=8, OS id=23135
Thu Mar 24 07:18:37 2011
LMD0 started with pid=9, OS id=23137
Thu Mar 24 07:18:37 2011
LMS0 started with pid=10, OS id=23148 at elevated priority
Thu Mar 24 07:18:37 2011
MMAN started with pid=11, OS id=23152
Thu Mar 24 07:18:38 2011
DBW0 started with pid=12, OS id=23170
Thu Mar 24 07:18:38 2011
LGWR started with pid=13, OS id=23176
Thu Mar 24 07:18:38 2011
CKPT started with pid=14, OS id=23218
Thu Mar 24 07:18:38 2011
SMON started with pid=15, OS id=23224
Thu Mar 24 07:18:38 2011
RBAL started with pid=16, OS id=23237
Thu Mar 24 07:18:38 2011
GMON started with pid=17, OS id=23239
lmon registered with NM - instance id 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 124)
ASM instance
List of nodes:
+0 1 2+
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* allocate domain 2, invalid = TRUE
* allocate domain 3, invalid = TRUE
* allocate domain 4, invalid = TRUE
* domain 0 valid = 1 according to instance 1
* domain 1 valid = 1 according to instance 1
* domain 2 valid = 1 according to instance 1
* domain 3 valid = 1 according to instance 1
* domain 4 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
LMS 0: 0 GCS shadows traversed, 0 replayed
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Mar 24 07:18:40 2011
LCK0 started with pid=18, OS id=23277
ORACLE_BASE from environment = /u01/app/oracle
Thu Mar 24 07:18:41 2011
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0xf7063e39
NOTE: cache began mount (not first) of group DATA number=1 incarn=0xf7063e39
NOTE: cache registered group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache began mount (not first) of group RECOVERY number=2 incarn=0xf7063e3a
NOTE: cache registered group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache began mount (not first) of group REDO01 number=3 incarn=0xf7163e3b
NOTE: cache registered group REDO02 number=4 incarn=0xf7163e3c
NOTE: cache began mount (not first) of group REDO02 number=4 incarn=0xf7163e3c
NOTE:Loaded lib: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
NOTE: Assigning number (1,0) to disk (ORCL:ASM_DATA1)
NOTE: Assigning number (1,1) to disk (ORCL:ASM_DATA2)
NOTE: Assigning number (2,0) to disk (ORCL:ASM_RECO1)
NOTE: Assigning number (3,0) to disk (ORCL:ASM_LOG1)
NOTE: Assigning number (4,0) to disk (ORCL:ASM_LOG2)
kfdp_query(): 5
kfdp_queryBg(): 5
NOTE: cache opening disk 0 of grp 1: DATA1 label:ASM_DATA1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA2 label:ASM_DATA2
NOTE: cache mounting (not first) group 1/0xF7063E39 (DATA)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 1
NOTE: LGWR attempting to mount thread 1 for diskgroup 1
NOTE: LGWR mounted thread 1 for disk group 1
NOTE: opening chunk 1 at fcn 0.10794571 ABA
NOTE: seq=81 blk=1313
NOTE: cache mounting group 1/0xF7063E39 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0xf7063e39
kfdp_query(): 6
kfdp_queryBg(): 6
NOTE: cache opening disk 0 of grp 2: RECO1 label:ASM_RECO1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 2/0xF7063E3A (RECOVERY)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 2
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.10436377 ABA
NOTE: seq=48 blk=4298
NOTE: cache mounting group 2/0xF7063E3A (RECOVERY) succeeded
NOTE: cache ending mount (success) of group RECOVERY number=2 incarn=0xf7063e3a
kfdp_query(): 7
kfdp_queryBg(): 7
NOTE: cache opening disk 0 of grp 3: LOG1 label:ASM_LOG1
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 3/0xF7163E3B (REDO01)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 3
NOTE: LGWR attempting to mount thread 1 for diskgroup 3
NOTE: LGWR mounted thread 1 for disk group 3
NOTE: opening chunk 1 at fcn 0.229332 ABA
NOTE: seq=30 blk=10690
NOTE: cache mounting group 3/0xF7163E3B (REDO01) succeeded
NOTE: cache ending mount (success) of group REDO01 number=3 incarn=0xf7163e3b
kfdp_query(): 8
kfdp_queryBg(): 8
NOTE: cache opening disk 0 of grp 4: LOG2 label:ASM_LOG2
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (not first) group 4/0xF7163E3C (REDO02)
kjbdomatt send to node 1
kjbdomatt send to node 2
NOTE: attached to recovery domain 4
NOTE: LGWR attempting to mount thread 1 for diskgroup 4
NOTE: LGWR mounted thread 1 for disk group 4
NOTE: opening chunk 1 at fcn 0.225880 ABA
NOTE: seq=30 blk=10556
NOTE: cache mounting group 4/0xF7163E3C (REDO02) succeeded
NOTE: cache ending mount (success) of group REDO02 number=4 incarn=0xf7163e3c
kfdp_query(): 9
kfdp_queryBg(): 9
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
kfdp_query(): 10
kfdp_queryBg(): 10
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup RECOVERY was mounted
kfdp_query(): 11
kfdp_queryBg(): 11
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup REDO01 was mounted
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 4
SUCCESS: diskgroup REDO02 was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT
Thu Mar 24 08:26:28 2011
Starting background process ASMB
Thu Mar 24 08:26:28 2011
ASMB started with pid=20, OS id=9597
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Thu Mar 24 08:27:39 2011
Starting background process ASMB
Thu Mar 24 08:27:39 2011
ASMB started with pid=25, OS id=10735
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Do i need to set the compatible parameter?
Regards,
Vish