Oracle Database Discussions

ORA-03135: connection lost contact

Samuel Rabini, Apr 3 2012 (edited Apr 3 2012)

Hi,

I work on an Oracle Database 11g Release 11.1.0.6.0 - 64bit Production With the Real Application Clusters option.

Tonight I experienced a very strange situation I'm not able to understand.

My RAC has these services:
EVODB (the RAC one)
EVODB1 (points to node 1)
EVODB2 (points to node 2)
EVOREAD (node 2 is preferred, node 1 is available)
IPGW (node 1 is preferred, node 2 is available)

Read intensive applications use EVOREAD
Write intensive applications use IPGW
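
To make the preferred/available layout concrete, here is a minimal toy model in Python. It is only a sketch of the intended placement, not how Clusterware decides it: the instance names EVODB1/EVODB2 are taken from the crsstat resource names in this post, the `Service` class and `placement` helper are hypothetical, and the model ignores the fact that a relocated service stays where it is until it is moved back by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    preferred: str  # instance the service should normally run on
    available: str  # instance it may fail over to


# Layout described in the post (instance names assumed from crsstat).
SERVICES = [
    Service("EVOREAD", preferred="EVODB2", available="EVODB1"),
    Service("IPGW", preferred="EVODB1", available="EVODB2"),
]


def placement(service, running):
    """Return the instance expected to host the service, or None.

    Toy rule: use the preferred instance if it is running,
    otherwise fall back to the available one.
    """
    for inst in (service.preferred, service.available):
        if inst in running:
            return inst
    return None


for svc in SERVICES:
    print(svc.name, "->", placement(svc, {"EVODB1", "EVODB2"}))
```

With only EVODB1 running, `placement` returns EVODB1 for both services, which matches EVOREAD having been moved to node 1 while node 2 was restarting.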

Web is using EVOREAD.
Tonight the web application (via PHP) was not able to connect to the database, and returned this error:

ORA-03135: connection lost contact

I then checked the services, but crsstat showed everything online:

[oracle@dcsrv-evodb02 ~]$ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.EVODB.EVODB1.inst                         ONLINE     ONLINE on dcsrv-evodb01
ora.EVODB.EVODB2.inst                         ONLINE     ONLINE on dcsrv-evodb02
ora.EVODB.EVOREAD.EVODB2.srv                  ONLINE     ONLINE on dcsrv-evodb02
ora.EVODB.EVOREAD.cs                          ONLINE     ONLINE on dcsrv-evodb02
ora.EVODB.IPGW.EVODB1.srv                     ONLINE     ONLINE on dcsrv-evodb01
ora.EVODB.IPGW.cs                             ONLINE     ONLINE on dcsrv-evodb01
ora.EVODB.db                                  ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.ASM1.asm                    ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.LISTENER_ASM_DCSRV-EVODB01.lsnr ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.LISTENER_DB_DCSRV-EVODB01.lsnr ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.gsd                         ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.ons                         ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb01.vip                         ONLINE     ONLINE on dcsrv-evodb01
ora.dcsrv-evodb02.ASM2.asm                    ONLINE     ONLINE on dcsrv-evodb02
ora.dcsrv-evodb02.LISTENER_ASM_DCSRV-EVODB02.lsnr ONLINE     ONLINE on dcsrv-evodb02
ora.dcsrv-evodb02.LISTENER_DB_DCSRV-EVODB02.lsnr ONLINE     ONLINE on dcsrv-evodb02
ora.dcsrv-evodb02.gsd                         ONLINE     ONLINE on dcsrv-evodb02
ora.dcsrv-evodb02.ons                         ONLINE     ONLINE on dcsrv-evodb02
ora.dcsrv-evodb02.vip                         ONLINE     ONLINE on dcsrv-evodb02
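
A check like this can be automated rather than read by eye. The following is a minimal sketch, assuming the output has exactly the three-column layout shown above (resource name, target, state with an optional "on host"); the function names are hypothetical and the script only parses text that has already been captured.

```python
import re

# Matches resource lines such as:
#   ora.EVODB.EVOREAD.cs   ONLINE   ONLINE on dcsrv-evodb02
LINE_RE = re.compile(r"^(ora\.\S+)\s+(\S+)\s+(\S+)(?:\s+on\s+(\S+))?\s*$")


def parse_crsstat(text):
    """Return a list of (resource, target, state, host) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # header and separator lines do not start with "ora."
            rows.append(m.groups())
    return rows


def unhealthy(rows):
    """Return the resources whose state differs from their target."""
    return [r for r in rows if r[1] != r[2]]


sample = """\
HA Resource                                   Target     State
-----------                                   ------     -----
ora.EVODB.EVODB2.inst                         ONLINE     ONLINE on dcsrv-evodb02
ora.EVODB.EVOREAD.EVODB2.srv                  ONLINE     ONLINE on dcsrv-evodb02
ora.EVODB.EVOREAD.cs                          ONLINE     ONLINE on dcsrv-evodb02
"""

rows = parse_crsstat(sample)
print(unhealthy(rows))  # prints []
```

An empty list only says that every resource is in its target state. As this case shows, that does not prove clients can actually connect through the service.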

The lsnrctl services output looked fine as well:

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
Services Summary...
Service "EVODB" has 2 instance(s).
  Instance "EVODB1", status READY, has 2 handler(s) for this service...
    Handler(s):
      "N000" established:0 refused:0 current:0 max:679 state:ready
         CMON <machine: dcsrv-evodb01, pid: 6985>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb01.xxxxx.xxx)(PORT=46498))
      "DEDICATED" established:52 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb01-vip.xxxxx.xxx)(PORT=1521))
  Instance "EVODB2", status READY, has 3 handler(s) for this service...
    Handler(s):
      "N000" established:29 refused:0 current:70 max:679 state:ready
         CMON <machine: dcsrv-evodb02, pid: 26709>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb02.xxxxx.xxx)(PORT=61966))
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb02-vip.xxxxx.xxx)(PORT=1521))
      "DEDICATED" established:290 refused:0 state:ready
         LOCAL SERVER
Service "EVODBXDB" has 2 instance(s).
  Instance "EVODB1", status READY, has 1 handler(s) for this service...
    Handler(s):
      "D000" established:0 refused:0 current:0 max:972 state:ready
         DISPATCHER <machine: dcsrv-evodb01, pid: 4494>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb01.xxxxx.xxx)(PORT=57381))
  Instance "EVODB2", status READY, has 1 handler(s) for this service...
    Handler(s):
      "D000" established:0 refused:0 current:0 max:972 state:ready
         DISPATCHER <machine: dcsrv-evodb02, pid: 26499>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb02.xxxxx.xxx)(PORT=34877))
Service "EVODB_XPT" has 2 instance(s).
  Instance "EVODB1", status READY, has 2 handler(s) for this service...
    Handler(s):
      "N000" established:0 refused:0 current:0 max:679 state:ready
         CMON <machine: dcsrv-evodb01, pid: 6985>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb01.xxxxx.xxx)(PORT=46498))
      "DEDICATED" established:52 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb01-vip.xxxxx.xxx)(PORT=1521))
  Instance "EVODB2", status READY, has 3 handler(s) for this service...
    Handler(s):
      "N000" established:29 refused:0 current:70 max:679 state:ready
         CMON <machine: dcsrv-evodb02, pid: 26709>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb02.xxxxx.xxx)(PORT=61966))
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb02-vip.xxxxx.xxx)(PORT=1521))
      "DEDICATED" established:290 refused:0 state:ready
         LOCAL SERVER
Service "EVOREAD" has 1 instance(s).
  Instance "EVODB2", status READY, has 3 handler(s) for this service...
    Handler(s):
      "N000" established:29 refused:0 current:70 max:679 state:ready
         CMON <machine: dcsrv-evodb02, pid: 26709>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb02.xxxxx.xxx)(PORT=61966))
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb02-vip.xxxxx.xxx)(PORT=1521))
      "DEDICATED" established:290 refused:0 state:ready
         LOCAL SERVER
Service "IPGW" has 1 instance(s).
  Instance "EVODB1", status READY, has 2 handler(s) for this service...
    Handler(s):
      "N000" established:0 refused:0 current:0 max:679 state:ready
         CMON <machine: dcsrv-evodb01, pid: 6985>
         (ADDRESS=(PROTOCOL=tcp)(HOST=dcsrv-evodb01.xxxxx.xxx)(PORT=46498))
      "DEDICATED" established:52 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=dcsrv-evodb01-vip.xxxxx.xxx)(PORT=1521))
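
Since this listing is long, a small script can reduce it to "which instance serves which service". This is a minimal sketch that assumes the `Service "..." has N instance(s).` and `Instance "...", status ...` line formats shown above; the function name is hypothetical and handler details are ignored.

```python
import re

SERVICE_RE = re.compile(r'^Service "([^"]+)" has \d+ instance\(s\)\.')
INSTANCE_RE = re.compile(r'^\s*Instance "([^"]+)", status (\w+),')


def instances_by_service(text):
    """Map each service name to a list of (instance, status) pairs."""
    result = {}
    current = None
    for line in text.splitlines():
        m = SERVICE_RE.match(line)
        if m:
            current = m.group(1)
            result.setdefault(current, [])
            continue
        m = INSTANCE_RE.match(line)
        if m and current is not None:
            result[current].append((m.group(1), m.group(2)))
    return result


sample = """\
Service "EVOREAD" has 1 instance(s).
  Instance "EVODB2", status READY, has 3 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:290 refused:0 state:ready
         LOCAL SERVER
Service "IPGW" has 1 instance(s).
  Instance "EVODB1", status READY, has 2 handler(s) for this service...
"""

for service, instances in instances_by_service(sample).items():
    print(service, instances)
```

For the output above this confirms that EVOREAD is registered only on EVODB2 and IPGW only on EVODB1, both with status READY.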

I tried to connect to the database from the web server machine using sqlplus @EVOREAD and I had no problem!
I was able to query normally.
Another application that was using EVOREAD was running without any problem.

No error in the alert log of both nodes.

I then restarted the EVOREAD service. Once it was up again, sqlplus from the web server machines stopped working and returned:

ORA-30006: resource busy; acquire with WAIT timeout expired

While the EVOREAD service was restarting, dozens of these errors were written to the alert log of node2:

Tue Apr 03 02:51:46 2012
ORA-30006 : opiodr aborting process L001 ospid (21941_46960412512672)
Tue Apr 03 02:51:46 2012
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1D72F3ED4] [PC:0xE03F9D, opitsk()+6977]
Errors in file /u01/app/oracle/diag/rdbms/evodb/EVODB2/trace/EVODB2_l001_21941.trc  (incident=451907):
ORA-07445: exception encountered: core dump [opitsk()+6977] [SIGSEGV] [ADDR:0x1D72F3ED4] [PC:0xE03F9D] [Address not mapped to object] []
ORA-30006: resource busy; acquire with WAIT timeout expired
ORA-30006: resource busy; acquire with WAIT timeout expired
Incident details in: /u01/app/oracle/diag/rdbms/evodb/EVODB2/incident/incdir_451907/EVODB2_l001_21941_i451907.trc
Tue Apr 03 02:52:11 2012
k2g_dtp_stop_svc(): Error occured while stopping service [EVOREAD]; some transactions might not have been completely cleaned up
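
To get an overview of an excerpt like this, the ORA- codes and incident numbers can be counted with a few lines of Python. This is a minimal sketch that works on alert log text already read into a string; the function name is hypothetical and the patterns assume the message formats shown above.

```python
import re
from collections import Counter

ORA_RE = re.compile(r"\bORA-\d{5}\b")
INCIDENT_RE = re.compile(r"\(incident=(\d+)\)")


def summarize_alert_log(text):
    """Return (Counter of ORA- codes, sorted list of incident ids)."""
    codes = Counter(ORA_RE.findall(text))
    incidents = sorted(set(INCIDENT_RE.findall(text)))
    return codes, incidents


sample = """\
ORA-30006 : opiodr aborting process L001 ospid (21941_46960412512672)
Errors in file /u01/app/oracle/diag/rdbms/evodb/EVODB2/trace/EVODB2_l001_21941.trc  (incident=451907):
ORA-07445: exception encountered: core dump [opitsk()+6977] [SIGSEGV] [ADDR:0x1D72F3ED4] [PC:0xE03F9D] [Address not mapped to object] []
ORA-30006: resource busy; acquire with WAIT timeout expired
ORA-30006: resource busy; acquire with WAIT timeout expired
"""

codes, incidents = summarize_alert_log(sample)
print(codes.most_common())
print(incidents)
```

For this sample the counts are three occurrences of ORA-30006 and one of ORA-07445, all tied to incident 451907.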

In the end I restarted the instance on node2 using the srvctl utility, and the instance was immediately shut down with abort (without first attempting a normal shutdown, as it usually does):

Tue Apr 03 02:58:36 2012
Shutting down instance (abort)
License high water mark = 74
USER (ospid: 25761): terminating the instance
Instance terminated by USER, pid = 25761
Tue Apr 03 02:58:41 2012
Instance shutdown complete

Once the instance was up and I had moved EVOREAD back to node2 (during the instance restart it had been moved to node1), everything started to work fine again.

I really don't understand the problem: at first glance everything seemed to work fine (sqlplus, crsstat, a lot of other applications).
What does the ORA-30006 returned by sqlplus mean?
Why did I start to get that error only after restarting the service?

And, well, an instance restart usually solves any kind of "process" or server resource problem... no doubt about that.

Any suggestion on how to detect the problem?

Thanks in advance

Locked on May 1 2012

Added on Apr 3 2012
