10.2.0.3.0 RH 5
customer had been doing some SAN work and caused a database problem, listener wouldnt start. Got called to look at it.
ok, lsnrctl status shows its up, I logged into database as sysdba query v$instance, all open and good.
SQL> select * from v$instance;
INSTANCE_NUMBER INSTANCE_NAME
--------------- ----------------
HOST_NAME
----------------------------------------------------------------
VERSION STARTUP_T STATUS PAR THREAD# ARCHIVE LOG_SWITCH_WAIT
----------------- --------- ------------ --- ---------- ------- ---------------
LOGINS SHU DATABASE_STATUS INSTANCE_ROLE ACTIVE_ST BLO
---------- --- ----------------- ------------------ --------- ---
1 DWH01
DWHBOX
10.2.0.3.0 08-MAY-12 OPEN NO 1 STOPPED
ALLOWED NO ACTIVE PRIMARY_INSTANCE NORMAL NO
SQL>
ok, whats the problem? user cant login with a TNS 12530 error, thats strange lets look at the alert log, (this is a test datawarehouse instance with no arc log mode) ... nothing in it since last night some time.
ok, lets see what oracle processes are running. whats that? No oracle processes!
>
[root@DWHBOX ~]# ps -ef | grep pmon
[root@DWHBOX ~]#
>
no, that cant be right, I just logged on to the database, lets have a look at all oracle processes
>
ps -ef | grep ora
root 5067 4933 0 14:45 pts/4 00:00:00 su - oracle
oracle 5068 5067 0 14:45 pts/4 00:00:00 -bash
oracle 5090 5068 0 14:46 pts/4 00:00:00 sqlplus as sysdba
oracle 5091 5090 0 14:46 ? 00:00:00 oracleDWH01 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root 5113 3065 0 14:46 pts/3 00:00:00 grep ora
root 15237 3823 0 Apr27 ? 00:00:04 sshd: oracle [priv]
oracle 15281 15237 0 Apr27 ? 00:00:06 sshd: oracle@pts/0
oracle 15282 15281 0 Apr27 pts/0 00:00:00 -bash
[root@DWHBOX ~]#
>
woooh, hold on, NO oracle processes at all except for the listener. I must have a typo, stop the listener and do an lsof | grep oracle to see what open files there are, if I can query v$instance then at least the controlfile must be open. (sits on /u01)
>
[root@DWHBOX ~]# lsof | grep oracle
[root@DWHBOX ~]#
>
nothing! I now officially have a ghost database. /shudder/
I know, lets go look at the v$process
>
oracle@DWHBOX ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on Tue May 8 14:46:04 2012
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Release 10.2.0.3.0 - Production
SQL>
SQL>
SQL> desc v$process
Name Null? Type
----------------------------------------- -------- ----------------------------
ADDR RAW(4)
PID NUMBER
SPID VARCHAR2(12)
USERNAME VARCHAR2(15)
SERIAL# NUMBER
TERMINAL VARCHAR2(30)
PROGRAM VARCHAR2(48)
TRACEID VARCHAR2(255)
BACKGROUND VARCHAR2(1)
LATCHWAIT VARCHAR2(8)
LATCHSPIN VARCHAR2(8)
PGA_USED_MEM NUMBER
PGA_ALLOC_MEM NUMBER
PGA_FREEABLE_MEM NUMBER
PGA_MAX_MEM NUMBER
SQL> select addr, program from v$process;
ADDR PROGRAM
-------- ------------------------------------------------
7F2B2758 PSEUDO
7F2B2D10 oracle@DWHBOX (PMON)
7F2B32C8 oracle@DWHBOX (PSP0)
7F2B3880 oracle@DWHBOX (MMAN)
7F2B3E38 oracle@DWHBOX (DBW0)
7F2B43F0 oracle@DWHBOX (LGWR)
7F2B49A8 oracle@DWHBOX (CKPT)
7F2B4F60 oracle@DWHBOX (SMON)
7F2B5518 oracle@DWHBOX (RECO)
7F2B5AD0 oracle@DWHBOX (CJQ0)
7F2B6088 oracle@DWHBOX (MMON)
ADDR PROGRAM
-------- ------------------------------------------------
7F2B6640 oracle@DWHBOX (MMNL)
7F2B6BF8 oracle@DWHBOX (D000)
7F2B71B0 oracle@DWHBOX (S000)
7F2B7768 oracle@DWHBOX (q000)
7F2B7D20 oracle@DWHBOX (QMNC)
7F2B82D8 oracle@DWHBOX (q001)
7F2B8890 oracle@DWHBOX (J000)
7F2B8E48 oracle@DWHBOX (TNS V1-V3)
>
theres the pmon, lets get the PID of pmon and track that down on the OS
>
SQL> select pid, program from v$process;
select pid, program from v$process
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
SQL>
>
Ouch, so trying to get the OS PID crashed my session, lets disconnect, sqlplus back in, but now Im on a dead database, so it looks like trying to query the PID killed whatever ghost instance was there
>
SQL> exit
Disconnected from Oracle Database 10g Release 10.2.0.3.0 - Production
[oracle@DWHBOX ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on Tue May 8 14:47:35 2012
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Connected to an idle instance.
SQL>
>
startup as normal and we're open for business and pmon an all other processes are up.
Can anyone explain that?
Edited by: deebee_eh on May 8, 2012 3:31 PM
Corrected typo: ora 12530 was the original error reported, not 12503.