Question about "PMON failed to acquire latch"

957498, Sep 4 2012 (edited Sep 5 2012)
Platform: Oracle 11.1.0.7 RAC on Linux 5.3 x86-64

The alert.log on node 1 reports the following errors:
Wed Sep 05 02:03:24 2012
PMON failed to acquire latch, see PMON dump
Wed Sep 05 02:04:06 2012


***********************************************************************
Wed Sep 05 02:04:06 2012
Wed Sep 05 02:04:06 2012


***********************************************************************

***********************************************************************


Fatal NI connect error 12170.

Fatal NI connect error 12170.

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
Time: 05-SEP-2012 02:04:06
Time: 05-SEP-2012 02:04:06
Tracing not turned on.
Time: 05-SEP-2012 02:04:06
Tracing not turned on.
Tns error struct:
ns main err code: 12535
Tns error struct:
Tracing not turned on.

ns main err code: 12535
Tns error struct:

ns main err code: 12535

TNS-12535: TNS:operation timed out
TNS-12535: TNS:operation timed out
TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
ns secondary err code: 12606
ns secondary err code: 12606
nt secondary err code: 0
nt main err code: 0
nt main err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.85)(PORT=42805))
nt secondary err code: 0
nt secondary err code: 0
WARNING: inbound connection timed out (ORA-3136)
nt OS err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.84)(PORT=54506))
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.70)(PORT=56320))
WARNING: inbound connection timed out (ORA-3136)
WARNING: inbound connection timed out (ORA-3136)
Wed Sep 05 02:04:06 2012


***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
Time: 05-SEP-2012 02:04:06
Tracing not turned on.
Tns error struct:
ns main err code: 12535

Wed Sep 05 02:04:06 2012


***********************************************************************
TNS-12535: TNS:operation timed out

Fatal NI connect error 12170.
ns secondary err code: 12606

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
nt main err code: 0
nt secondary err code: 0
Time: 05-SEP-2012 02:04:06
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.84)(PORT=54513))
Tracing not turned on.
WARNING: inbound connection timed out (ORA-3136)
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.84)(PORT=54510))
WARNING: inbound connection timed out (ORA-3136)
Wed Sep 05 02:04:06 2012


***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
Time: 05-SEP-2012 02:04:06
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.70)(PORT=56322))
WARNING: inbound connection timed out (ORA-3136)
Wed Sep 05 02:04:06 2012


***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
Time: 05-SEP-2012 02:04:06
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.85)(PORT=42803))
WARNING: inbound connection timed out (ORA-3136)
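
Several server processes wrote to the alert log at the same moment, so the entries above are interleaved and hard to read by eye. As a rough aid, the following Python sketch tallies the `Client address` lines per client host. The function name `count_timeout_clients` and the shortened sample text are illustrative assumptions and not part of any Oracle tool.

```python
import re
from collections import Counter

# Matches the "Client address" line that accompanies each ORA-3136 warning.
CLIENT_RE = re.compile(
    r"Client address: \(ADDRESS=\(PROTOCOL=tcp\)\(HOST=([\d.]+)\)\(PORT=(\d+)\)\)"
)

def count_timeout_clients(alert_text: str) -> Counter:
    """Count 'Client address' entries per client host in alert log text."""
    return Counter(m.group(1) for m in CLIENT_RE.finditer(alert_text))

# Shortened sample copied from the excerpt above (hypothetical selection of lines).
sample = """\
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.85)(PORT=42805))
WARNING: inbound connection timed out (ORA-3136)
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.84)(PORT=54506))
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.84.70)(PORT=56320))
"""

counts = count_timeout_clients(sample)
for host, n in sorted(counts.items()):
    print(host, n)
```

Run against the full excerpt, this kind of tally would show that the timed-out connections came from three client hosts (172.16.84.70, 172.16.84.84 and 172.16.84.85), all within the same second.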


The contents of the PMON dump file zzbrac31_pmon_9521.trc are as follows:

Trace file /u01/app/11.1.0/diag/rdbms/zzbrac3/zzbrac31/trace/zzbrac31_pmon_9521.trc
Oracle Database 11g Release 11.1.0.7.0 - 64bit Production
With the Real Application Clusters option
ORACLE_HOME = /u01/app/11.1.0/db
System name: Linux
Node name: zzbrac31
Release: 2.6.18-128.el5
Version: #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine: x86_64
Instance name: zzbrac31
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 9521, image: oracle@zzbrac31 (PMON)


*** 2012-09-05 02:02:53.628
*** SESSION ID:(664.1) 2012-09-05 02:02:53.628
*** CLIENT ID:() 2012-09-05 02:02:53.628
*** SERVICE NAME:(SYS$BACKGROUND) 2012-09-05 02:02:53.628
*** MODULE NAME:() 2012-09-05 02:02:53.628
*** ACTION NAME:() 2012-09-05 02:02:53.628

PMON unable to acquire latch 6000fb20 active service list level=0
Location from where latch is held: ksws2.h LINE:315 ID:kswsgsnp: get service name ptr:
Context saved from call: 0
state=busy(shared) [value=0x4000000000000001] wlstate=free [value=0]
waiters [orapid (seconds since: put on list, posted, alive check)]:
265 (63, 1346781773, 4)
42 (63, 1346781773, 63)
47 (62, 1346781773, 62)
25 (62, 1346781773, 62)
76 (57, 1346781773, 2)
268 (56, 1346781773, 56)
51 (56, 1346781773, 56)
44 (50, 1346781773, 50)
43 (50, 1346781773, 50)
68 (42, 1346781773, 2)
272 (34, 1346781773, 34)
274 (7, 1346781773, 7)
277 (7, 1346781773, 7)
278 (2, 1346781773, 2)
waiter count=14
gotten 88124516 times wait, failed first 163034 sleeps 157164
gotten 4556574 times nowait, failed: 2766
Short stack dump:


*** 2012-09-05 02:02:54.019
<-ksedsts()+315<-kslgess()+2610<-ksl_get_shared_latch()+610<-kswsgpbr()+98<-kmmsvcgu()+68<-kmmlrl()+3359<-ksucln()+1673<-ksbrdp()+1487<-opirip()+609<-opidrv()+554<-sou2o()+90<-opimai_real()+275<-ssthrdmain()+177<-main()+215<-__libc_start_main()+244<-_start()+41
possible holder pid = 69 ospid=23839
----------------------------------------
SO: 0x470f678e8, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x470f678e8, name=process, file=ksu.h LINE:10706, pg=0
(process) Oracle pid:69, ser:95, calls cur/top: (nil)/0x4d67f5678
flags : (0x0) -
flags2: (0x0), flags3: (0x0)
int error: 0, call error: 0, sess error: 0, txn error 0
ksudlp FALSE at location: 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=1
holding (efd=4) 6000fb20 active service list level=0
Location from where latch is held: ksws2.h LINE:315 ID:kswsgsnp: get service name ptr:
Context saved from call: 0
state=busy(shared) [value=0x4000000000000001] wlstate=free [value=0]
waiters [orapid (seconds since: put on list, posted, alive check)]:
265 (64, 1346781774, 5)
42 (64, 1346781774, 64)
47 (63, 1346781774, 63)
25 (63, 1346781774, 63)
76 (58, 1346781774, 3)
268 (57, 1346781774, 57)
51 (57, 1346781774, 57)
44 (51, 1346781774, 51)
43 (51, 1346781774, 51)
68 (43, 1346781774, 3)
272 (35, 1346781774, 35)
274 (8, 1346781774, 8)
277 (8, 1346781774, 8)
278 (3, 1346781774, 3)
waiter count=14
Process Group: DEFAULT, pseudo proc: 0x480f485b8
O/S info: user: oracle, term: UNKNOWN, ospid: 23839
OSD pid info: Unix process pid: 23839, image: oracle@zzbrac31

*** 2012-09-05 02:03:24.032
Short stack dump: ORA-32518: cannot wait for process 'Unix process pid: 23839, image: oracle@zzbrac31' to finish executing ORADEBUG command 'SHORT_STACK' (waited 30000 ms); total wait time exceeds 30000 ms

Dump of memory from 0x0000000470F45AC0 to 0x0000000470F45CC8
470F45AC0 00000000 00000000 00000000 00000000 [................]
Repeat 31 times
470F45CC0 00000000 00000000 [........]

*** 2012-09-05 02:03:24.032
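
The waiter list in the trace has the form `orapid (seconds since: put on list, posted, alive check)`. A minimal sketch for extracting those rows and finding the process that has waited longest might look like the following. `Waiter` and `parse_waiters` are hypothetical names, and the regular expression assumes exactly the layout shown in this trace.

```python
import re
from typing import List, NamedTuple

class Waiter(NamedTuple):
    orapid: int            # Oracle process id of the waiter
    on_list_secs: int      # seconds since it was put on the wait list
    posted: int            # "posted" column, copied as-is from the trace
    alive_check_secs: int  # seconds since the last alive check

# One waiter row per line, e.g. "265 (63, 1346781773, 4)".
WAITER_RE = re.compile(r"^[ \t]*(\d+) \((\d+), (\d+), (\d+)\)[ \t]*$", re.MULTILINE)

def parse_waiters(trace_text: str) -> List[Waiter]:
    """Return every waiter row found in the trace text, in file order."""
    return [Waiter(*map(int, m.groups())) for m in WAITER_RE.finditer(trace_text)]

# Shortened sample copied from the trace above.
sample = """\
waiters [orapid (seconds since: put on list, posted, alive check)]:
265 (63, 1346781773, 4)
42 (63, 1346781773, 63)
278 (2, 1346781773, 2)
waiter count=14
"""

waiters = parse_waiters(sample)
longest = max(waiters, key=lambda w: w.on_list_secs)
print(len(waiters), "waiters parsed; longest-waiting orapid:", longest.orapid)
```

The header line and the `waiter count=14` line do not match the pattern, so only the per-process rows are returned.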

The other node reports no errors.

I searched MOS (My Oracle Support), and this looks very similar to Bug 6918493 and Bug 8502963.

Could you please help analyze this? Many thanks!