Weird ASM errors
734109, Mar 17 2010 (edited Mar 18 2010)
We have two Sun Fire 6900 servers running Clusterware 10.1.0.4..2, ASM, and the database.
Node 1 runs Forms and Reports Services for Solaris 10.
Node 2 runs the application server.
The two nodes run different databases: spdb on node 1 and turbodb on node 2.
Recently the following sequence of errors has been appearing in the log files, followed by loss of service on the Reports server; eventually we have to reboot machine 1.
ASM ALERT FILE
Wed Mar 17 09:51:12 2010
Errors in file /opt/orabase/asmhome/admin/+ASM/udump/+asm1_ora_5759.trc:
ORA-07445: exception encountered: core dump [__lwp_kill()+8] [SIGIOT] [unknown code] [0x167F00000000] [] []
Wed Mar 17 09:51:12 2010
Trace dumping is performing id=[cdmp_20100317095112]
DB ALERT
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_12196.trc.
Wed Mar 17 09:51:12 2010
Thread 1 advanced to log sequence 41853
Current log# 2 seq# 41853 mem# 0: +DATA/spisdb/onlinelog/group_22
Current log# 2 seq# 41853 mem# 1: +DATA/spisdb/onlinelog/group_2
Wed Mar 17 09:51:21 2010
Errors in file /opt/orabase/dbhome/admin/SPISDB/bdump/spisdb_arc3_6847.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '+DATA/spisdb/onlinelog/group_11'
ORA-17503: ksfdopn:2 Failed to open file +DATA/spisdb/onlinelog/group_11
ORA-03113: end-of-file on communication channel
Wed Mar 17 09:56:40 2010
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_4820.trc.
Wed Mar 17 09:56:41 2010
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_18667.trc.
Wed Mar 17 10:01:49 2010
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_1200.trc.
Wed Mar 17 10:17:31 2010
Thread 1 advanced to log sequence 41854
Current log# 3 seq# 41854 mem# 0: +DATA/spisdb/onlinelog/group_33
Current log# 3 seq# 41854 mem# 1: +DATA/spisdb/onlinelog/group_3
Wed Mar 17 10:22:14 2010
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_25503.trc.
Wed Mar 17 10:24:29 2010
ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_11719.trc.
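To line the ORA- errors up against OS-level messages by time, a rough Python sketch like the one below could pull (timestamp, error code) pairs out of an alert log excerpt. It assumes the timestamp lines look exactly like the ones above ("Wed Mar 17 09:51:21 2010") and that English day and month names are in effect; the function name parse_alert_log and the SAMPLE list are only illustrative.

```python
import re
from datetime import datetime

# Assumed timestamp layout, taken from the excerpt: "Wed Mar 17 09:51:21 2010"
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
# Upper-case ORA- followed by five digits; file names such as spisdb_ora_4820 do not match
ORA_RE = re.compile(r"\bORA-(\d{5})\b")

def parse_alert_log(lines):
    """Yield (timestamp, ora_code) pairs.

    The timestamp is the most recent timestamp line seen before the error,
    or None if the excerpt starts with an error line.
    """
    current = None
    for line in lines:
        text = line.strip()
        try:
            current = datetime.strptime(text, TS_FORMAT)
            continue  # a pure timestamp line carries no error code
        except ValueError:
            pass
        for match in ORA_RE.finditer(text):
            yield current, "ORA-" + match.group(1)

# A few lines copied from the DB alert excerpt, used only as sample input
SAMPLE = [
    "Wed Mar 17 09:51:21 2010",
    "Errors in file /opt/orabase/dbhome/admin/SPISDB/bdump/spisdb_arc3_6847.trc:",
    "ORA-00313: open failed for members of log group 1 of thread 1",
    "ORA-03113: end-of-file on communication channel",
    "Wed Mar 17 09:56:40 2010",
    "ORA-00060: Deadlock detected. More info in file /opt/orabase/dbhome/admin/SPISDB/udump/spisdb_ora_4820.trc.",
]

if __name__ == "__main__":
    for when, code in parse_alert_log(SAMPLE):
        print(when, code)
```

In practice the lines would come from the alert log file itself rather than a hard-coded list.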
dmesg is full of lines like these:
Mar 17 10:23:03 e6900ap3 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
Mar 17 10:23:03 e6900ap3 /scsi_vhci/ssd@g600a0b80002624f4000034224b11ee2f (ssd73): Command Timeout on path /ssm@0,0/pci@1b,700000/SUNW,qlc@2/fp@0,0 (fp2)
Mar 17 10:23:03 e6900ap3 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80002624f4000034224b11ee2f (ssd73):
Mar 17 10:23:03 e6900ap3 SCSI transport failed: reason 'timeout': retrying command
Mar 17 10:26:36 e6900ap3 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80002624f4000034224b11ee2f (ssd73):
Mar 17 10:26:36 e6900ap3 SCSI transport failed: reason 'tran_err': retrying command
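To see which devices and failure reasons dominate, the dmesg output could be tallied with a small Python sketch such as the following. It assumes the two-line layout shown above, where a "WARNING: <path> (<device>):" line is followed by a "SCSI transport failed: reason '...'" line; the function name tally_transport_failures and the SAMPLE list are illustrative, not part of any tool.

```python
import re
from collections import Counter

# "WARNING: <device path> (<short name>):" at the end of a line
WARN_RE = re.compile(r"WARNING: (\S+) \((\w+)\):\s*$")
# "SCSI transport failed: reason '<reason>'"
REASON_RE = re.compile(r"SCSI transport failed: reason '([^']+)'")

def tally_transport_failures(lines):
    """Count (device, reason) pairs for SCSI transport failures.

    The device is taken from the most recent WARNING line, so a failure
    line with no preceding WARNING is counted under device None.
    """
    counts = Counter()
    device = None
    for line in lines:
        warn = WARN_RE.search(line)
        if warn:
            device = warn.group(2)
            continue
        reason = REASON_RE.search(line)
        if reason:
            counts[(device, reason.group(1))] += 1
    return counts

# Lines copied from the dmesg excerpt, used only as sample input
SAMPLE = [
    "Mar 17 10:23:03 e6900ap3 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80002624f4000034224b11ee2f (ssd73):",
    "Mar 17 10:23:03 e6900ap3 SCSI transport failed: reason 'timeout': retrying command",
    "Mar 17 10:26:36 e6900ap3 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80002624f4000034224b11ee2f (ssd73):",
    "Mar 17 10:26:36 e6900ap3 SCSI transport failed: reason 'tran_err': retrying command",
]

if __name__ == "__main__":
    for (dev, why), n in tally_transport_failures(SAMPLE).items():
        print(dev, why, n)
```

If one device or one path accounts for nearly all the failures, that would at least narrow down where to look.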
-bash-3.00$ cat /opt/orabase/asmhome/admin/+ASM/udump/+asm1_ora_5759.trc
/opt/orabase/asmhome/admin/+ASM/udump/+asm1_ora_5759.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
ORACLE_HOME = /opt/orabase/asmhome
System name: SunOS
Node name: e6900ap3
Release: 5.10
Version: Generic_118833-20
Machine: sun4u
Instance name: +ASM1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
Unix process pid: 5759, image: oracle@e6900ap3
Exception signal: 6 (SIGIOT), code: -1 (unknown code), addr: 0x167f00000000, exception issued by pid: 5759, uid: 200, PC: [0xffffffff7aace500, __lwp_kill()+8]
*** 2010-03-17 09:51:12.165
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [__lwp_kill()+8] [SIGIOT] [unknown code] [0x167F00000000] [] []
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+744 CALL ksedst() 000000840 ?
FFFFFFFF7FFF517C ?
000000000 ?
FFFFFFFF7FFF1C70 ?
FFFFFFFF7FFF09D8 ?
FFFFFFFF7FFF13D8 ?
ssexhd()+1000 CALL ksedmp() 000106000 ? 106323304 ?
106323000 ? 000106323 ?
000106000 ? 106323304 ?
__sighndlr()+12 PTR_CALL 0000000000000000 000380007 ?
FFFFFFFF7FFF8EF0 ?
000000067 ? 000380000 ?
000000006 ? 106323300 ?
call_user_handler() CALL __sighndlr() 000000006 ?
+992 FFFFFFFF7FFF8EF0 ?
FFFFFFFF7FFF8C10 ?
10032D860 ? 000000000 ?
000000005 ?
raise()+16 CALL pthread_kill() FFFFFFFF7AC02000 ?
........
Has anyone seen something like this?
I manage the machine from 2000 miles away, so there is no chance of a physical look.