node 1 evicted due to ORA-29740

686680 · Aug 26 2009 — edited Aug 26 2009
Hi Guys,

We have a two-node cluster running 10.2.0.2 on HP-UX 11.31.
Yesterday node 1 was evicted by the other node due to an ORA-29740 error.
When I checked the alert log files I saw some IPC errors; below are excerpts from the alert logs of both nodes.
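For context, this is roughly how I pulled the relevant lines out of the alert logs (the sample file below is just a stand-in so the commands can be run anywhere; our real logs live under the bdump directory shown in the excerpts, and the grep pattern is the part that matters):

```shell
# Minimal sketch: extract IPC-timeout and ORA-29740 lines from an alert log.
# The heredoc fakes a tiny alert log so the command is self-contained;
# against the real file you would point grep at e.g.
#   /u01/app/oracle/db/admin/orcl/bdump/alert_orcl1.log
cat > /tmp/alert_sample.log <<'EOF'
Mon Aug 24 22:50:04 2009
IPC Send timeout detected. Receiver ospid 15041
Mon Aug 24 22:52:27 2009
ORA-29740: evicted by member 1, group incarnation 10
Mon Aug 24 22:03:00 2009
Thread 1 advanced to log sequence 10484
EOF

# Timestamps sit on the line above each message in the alert log,
# so -B1 (where your grep supports it) would pull them in too.
grep -E 'IPC Send timeout|ORA-29740' /tmp/alert_sample.log
```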

Node 1 Alert log file
Mon Aug 24 22:03:00 2009
Thread 1 advanced to log sequence 10484
Current log# 7 seq# 10484 mem# 0: +DATADG/orcl/onlinelog/group_7.298.670427121
Mon Aug 24 22:03:00 2009
SUCCESS: diskgroup FLASHDG was mounted
SUCCESS: diskgroup FLASHDG was dismounted
Mon Aug 24 22:50:04 2009
IPC Send timeout detected. Receiver ospid 15041
Mon Aug 24 22:51:08 2009
Trace dumping is performing id=[cdmp_20090824225031]
Mon Aug 24 22:52:27 2009
Errors in file /u01/app/oracle/db/admin/orcl/bdump/orcl1_lmon_15039.trc:
ORA-29740: evicted by member 1, group incarnation 10
Mon Aug 24 22:52:27 2009
LMON: terminating instance due to error 29740
Mon Aug 24 22:52:27 2009
Errors in file /u01/app/oracle/db/admin/orcl/bdump/orcl1_lms1_15045.trc:
ORA-29740: evicted by member , group incarnation
Mon Aug 24 22:52:27 2009
Errors in file /u01/app/oracle/db/admin/orcl/bdump/orcl1_lms0_15043.trc:
ORA-29740: evicted by member , group incarnation
Mon Aug 24 22:52:30 2009
Errors in file /u01/app/oracle/db/admin/orcl/bdump/orcl1_rbal_15336.trc:
ORA-29740: evicted by member , group incarnation
Mon Aug 24 22:52:59 2009
Shutting down instance (abort)
License high water mark = 254
Mon Aug 24 22:53:02 2009
Instance terminated by LMON, pid = 15039
Mon Aug 24 22:53:04 2009
Instance terminated by USER, pid = 8745
Mon Aug 24 22:53:13 2009
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
-----------

Node 2 Alert log file

Mon Aug 24 19:55:31 2009
Thread 2 advanced to log sequence 6803
Current log# 10 seq# 6803 mem# 0: +DATADG/orcl/onlinelog/group_10.301.670427207
Mon Aug 24 19:55:31 2009
SUCCESS: diskgroup FLASHDG was mounted
SUCCESS: diskgroup FLASHDG was dismounted
Mon Aug 24 22:50:03 2009
IPC Send timeout detected.Sender: ospid 6382
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:04 2009
IPC Send timeout detected.Sender: ospid 25897
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:05 2009
IPC Send timeout detected.Sender: ospid 26617
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:06 2009
IPC Send timeout detected.Sender: ospid 25678
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:07 2009
IPC Send timeout detected.Sender: ospid 21344
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:31 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 198
Mon Aug 24 22:50:31 2009
Communications reconfiguration: instance_number 1
Mon Aug 24 22:50:33 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 112
Mon Aug 24 22:50:35 2009
Trace dumping is performing id=[cdmp_20090824225031]
Mon Aug 24 22:50:35 2009
IPC Send timeout detected.Sender: ospid 984
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:35 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 15
Mon Aug 24 22:50:49 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 16
Mon Aug 24 22:50:52 2009
IPC Send timeout detected.Sender: ospid 12489
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:50:57 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 84
Mon Aug 24 22:51:00 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 97
Mon Aug 24 22:51:07 2009
IPC Send timeout to 0.0 inc 8 for msg type 12 from opid 75
Mon Aug 24 22:51:08 2009
IPC Send timeout detected.Sender: ospid 8900
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:51:25 2009
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:52:09 2009
Mon Aug 24 22:52:42 2009
Waiting for instances to leave:
1
Mon Aug 24 22:52:57 2009
IPC Send timeout detected.Sender: ospid 6378
Receiver: inst 1 binc 275179919 ospid 15041
Mon Aug 24 22:53:02 2009
Reconfiguration started (old inc 8, new inc 12)
List of nodes:
1
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Mon Aug 24 22:53:02 2009
LMS 0: 10 GCS shadows cancelled, 2 closed
Mon Aug 24 22:53:02 2009
LMS 1: 1 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Mon Aug 24 22:53:04 2009
LMS 0: 317502 GCS shadows traversed, 0 replayed
Mon Aug 24 22:53:04 2009
LMS 1: 302589 GCS shadows traversed, 0 replayed
Mon Aug 24 22:53:04 2009
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Mon Aug 24 22:53:04 2009
Instance recovery: looking for dead threads
Mon Aug 24 22:53:04 2009
Beginning instance recovery of 1 threads
Reconfiguration complete
Mon Aug 24 22:53:06 2009
parallel recovery started with 3 processes
Mon Aug 24 22:53:07 2009
Started redo scan
Mon Aug 24 22:53:07 2009
Completed redo scan
53 redo blocks read, 30 data blocks need recovery
Mon Aug 24 22:53:07 2009
Started redo application at
Thread 1: logseq 10484, block 40586
Mon Aug 24 22:53:07 2009
Recovery of Online Redo Log: Thread 1 Group 7 Seq 10484 Reading mem 0
Mem# 0 errs 0: +DATADG/orcl/onlinelog/group_7.298.670427121
Mon Aug 24 22:53:08 2009
Completed redo application
Mon Aug 24 22:53:08 2009
Completed instance recovery at
Thread 1: logseq 10484, block 40639, scn 1479311755
30 data blocks read, 32 data blocks written, 53 redo blocks read
Switch log for thread 1 to sequence 10485
Mon Aug 24 22:53:27 2009
Reconfiguration started (old inc 12, new inc 14)
List of nodes:
0 1
Global Resource Directory frozen
Communication channels reestablished
* domain 0 valid = 1 according to instance 0
Mon Aug 24 22:53:27 2009
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Mon Aug 24 22:53:27 2009
LMS 0: 0 GCS shadows cancelled, 0 closed
Mon Aug 24 22:53:27 2009
LMS 1: 0 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Mon Aug 24 22:53:28 2009
LMS 1: 11913 GCS shadows traversed, 4001 replayed
Mon Aug 24 22:53:28 2009
LMS 0: 11725 GCS shadows traversed, 4001 replayed
Mon Aug 24 22:53:28 2009
LMS 0: 11680 GCS shadows traversed, 4001 replayed
Mon Aug 24 22:53:28 2009
LMS 1: 11945 GCS shadows traversed, 4001 replayed
Mon Aug 24 22:53:28 2009
LMS 1: 11808 GCS shadows traversed, 4001 replayed
LMS 1: 239 GCS shadows traversed, 80 replayed
Mon Aug 24 22:53:28 2009
LMS 0: 8065 GCS shadows traversed, 2737 replayed
Mon Aug 24 22:53:28 2009
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Tue Aug 25 02:11:36 2009
Thread 2 advanced to log sequence 6804
Current log# 12 seq# 6804 mem# 0: +DATADG/orcl/onlinelog/group_12.303.670427257
-------------------------

I also checked the CPU usage and saw that one Oracle process, SMON, is utilising 86% CPU:

CPU TTY  PID  USERNAME PRI NI SIZE   RES    STATE TIME    %WCPU %CPU  COMMAND
1   ?    6378 oracle   241 20 17060M 18200K run   1951:13 86.48 86.33 ora_smon_orcl
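For reference, this is the kind of one-liner I used to spot the hot process. On HP-UX 11.31 the `-o` format options need the UNIX95 variable set (e.g. `UNIX95= ps -ef -o pcpu,pid,user,comm | sort -rn | head`); the portable sketch below shows the same idea:

```shell
# Sketch: list processes sorted by CPU%, highest first.
# On HP-UX, prefix with "UNIX95= " to enable the XPG4 -o option set.
ps -eo pcpu,pid,user,comm | sort -rn | head -5
```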

Please help me investigate this issue.
Post Details
Locked on Sep 23 2009
Added on Aug 26 2009
1 comment
4,325 views