Getting hung database after OEL upgrade, BUG: soft lockup - CPU#1 stuck

User12609554-Oracle, Nov 15 2010 (edited Nov 17 2010)
I ran a yum update on an X2100 system, which is now running
Linux version 2.6.18-194.17.4.0.1.el5 (mockbuild@ca-build9.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Oct 26 20:10:33 EDT 2010

The update installed
oracleasm-support - 2.1.3-1.el5.x86_64
oracleasm-2.6.18-92.el5 - 2.0.5-1.el5.x86_64

Then I updated oracleasm to match the kernel.

oracleasm-2.6.18-194.17.4.0.1.el5-2.0.5-1.el5.x86_64
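
One way to double-check that the installed oracleasm kernel-module package really matches the running kernel is to compare the package names against the kernel release string. The following is only a minimal Python sketch, assuming the naming pattern seen in the package names above (`oracleasm-<kernel release>-<module version>`); the function name `matching_oracleasm_packages` is hypothetical, and the values are copied from this post rather than read from a live system.

```python
def matching_oracleasm_packages(kernel_release, package_names):
    """Return the oracleasm kernel-module packages built for kernel_release.

    Assumes package names of the form
    oracleasm-<kernel release>-<module version>, as seen in this post.
    """
    prefix = "oracleasm-" + kernel_release + "-"
    return [name for name in package_names if name.startswith(prefix)]


if __name__ == "__main__":
    # Values copied from this post; on a live system they would come from
    # `uname -r` and `rpm -qa 'oracleasm*'`.
    running = "2.6.18-194.17.4.0.1.el5"
    installed = [
        "oracleasm-support-2.1.3-1.el5.x86_64",
        "oracleasm-2.6.18-92.el5-2.0.5-1.el5.x86_64",
        "oracleasm-2.6.18-194.17.4.0.1.el5-2.0.5-1.el5.x86_64",
    ]
    print(matching_oracleasm_packages(running, installed))
```

An empty result would suggest that no module package was built for the running kernel, which is worth ruling out before looking at the storage stack.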

Initially the system boots up OK and the Oracle database runs fine.

But then I get some sort of file corruption that hangs the database and forces a reboot and a manual fsck.

Here are some messages:
Nov 13 10:48:46 aus-perfdb kernel: BUG: soft lockup - CPU#1 stuck for 65s! [swapper:0]
Nov 13 10:48:46 aus-perfdb kernel: CPU 1:
Nov 13 10:48:46 aus-perfdb kernel: Modules linked in: nfs fscache nfsd exportfs nfs_acl ipv6 xfrm_nalgo oracleasm(U) autofs4 hidp rfcomm l2cap bluetooth rpcsec_gss_krb5 auth_rpcgss testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi crypto_api des lockd sunrpc cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sr_mod i2c_amd756 cdrom k8temp k8_edac i2c_amd8111 i2c_core e1000 hwmon edac_mc serio_raw amd_rng pcspkr sg dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage shpchp sata_mv libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Nov 13 10:48:46 aus-perfdb kernel: Pid: 0, comm: swapper Tainted: G 2.6.18-194.17.4.0.1.el5 #1
Nov 13 10:48:46 aus-perfdb kernel: RIP: 0010:[<ffffffff80064b50>] [<ffffffff80064b50>] _spin_unlock_irqrestore+0x8/0x9
Nov 13 10:48:46 aus-perfdb kernel: RSP: 0018:ffff8101070efd48 EFLAGS: 00000246
Nov 13 10:48:46 aus-perfdb kernel: RAX: 0000000000000000 RBX: ffff8103ffa81000 RCX: 0000000000000001
Nov 13 10:48:46 aus-perfdb kernel: RDX: 0000000000000282 RSI: 0000000000000246 RDI: ffff8103ffa81050
Nov 13 10:48:46 aus-perfdb kernel: RBP: ffff8101070efcc0 R08: ffff8101ff03f3f0 R09: ffff81020726c000
Nov 13 10:48:46 aus-perfdb kernel: R10: ffff8101c20c9288 R11: ffffffff80044ffe R12: ffffffff8005dc8e
Nov 13 10:48:46 aus-perfdb kernel: R13: ffff8101a2faf9c0 R14: ffffffff8007821b R15: ffff8101070efcc0
Nov 13 10:48:46 aus-perfdb kernel: FS: 00002b3bb3b3bc90(0000) GS:ffff81010709a440(0000) knlGS:0000000000000000
Nov 13 10:48:46 aus-perfdb kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Nov 13 10:48:46 aus-perfdb kernel: CR2: 00002addc57ff000 CR3: 00000001d0753000 CR4: 00000000000006e0
Nov 13 10:48:46 aus-perfdb kernel:
Nov 13 10:48:46 aus-perfdb kernel: Call Trace:
Nov 13 10:48:46 aus-perfdb kernel: <IRQ> [<ffffffff88075c70>] :scsi_mod:scsi_dispatch_cmd+0x27d/0x2ff
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8807b174>] :scsi_mod:scsi_request_fn+0x2c1/0x390
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8005c1f7>] blk_run_queue+0x41/0x73
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8807979d>] :scsi_mod:scsi_run_queue+0x155/0x1bf
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff88079efe>] :scsi_mod:scsi_next_command+0x2d/0x39
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8807a07d>] :scsi_mod:scsi_end_request+0xbf/0xcd
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8807a1d9>] :scsi_mod:scsi_io_completion+0x14e/0x324
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff880a7802>] :sd_mod:sd_rw_intr+0x252/0x28c
Nov 13 10:48:46 aus-perfdb smartd[7328]: Device: /dev/sdal, failed to read SMART Attribute Data
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff8807a46e>] :scsi_mod:scsi_device_unbusy+0x67/0x81
Nov 13 10:48:46 aus-perfdb smartd[7328]: Sending warning via mail to root ...
Nov 13 10:48:46 aus-perfdb kernel: [<ffffffff80037ca3>] blk_done_softirq+0x5f/0x6d
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8001244a>] __do_softirq+0x89/0x133
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8006cba6>] do_softirq+0x2c/0x85
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8006ca2e>] do_IRQ+0xec/0xf5
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8006b35e>] default_idle+0x0/0x50
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8005d615>] ret_from_intr+0x0/0xa
Nov 13 10:48:47 aus-perfdb kernel: <EOI> [<ffffffff8006b387>] default_idle+0x29/0x50
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff8004920e>] cpu_idle+0x95/0xb8
Nov 13 10:48:47 aus-perfdb kernel: [<ffffffff80077987>] start_secondary+0x498/0x4a7
Nov 13 10:48:47 aus-perfdb kernel:
Nov 13 10:48:47 aus-perfdb kernel: sd 37:0:0:0: timing out command, waited 60s
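
To see how often these lockups occur and on which CPU, the soft lockup lines can be pulled out of /var/log/messages. This is a rough Python sketch, assuming the message format shown in the log above; the names `SOFT_LOCKUP` and `find_soft_lockups` are made up for illustration.

```python
import re

# Matches lines such as:
#   kernel: BUG: soft lockup - CPU#1 stuck for 65s! [swapper:0]
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return (cpu, seconds, command, pid) for each soft lockup line."""
    events = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append((
                int(match.group("cpu")),
                int(match.group("secs")),
                match.group("comm"),
                int(match.group("pid")),
            ))
    return events


if __name__ == "__main__":
    sample = [
        "Nov 13 10:48:46 aus-perfdb kernel: BUG: soft lockup - CPU#1 "
        "stuck for 65s! [swapper:0]",
        "Nov 13 10:48:47 aus-perfdb kernel: sd 37:0:0:0: timing out "
        "command, waited 60s",
    ]
    for cpu, secs, comm, pid in find_soft_lockups(sample):
        print("CPU#%d stuck for %ds in %s (pid %d)" % (cpu, secs, comm, pid))
```

On a real system the list of lines would come from reading the log file instead of the inline sample.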

Patches? Tweaks?
Post Details
Locked on Dec 15 2010
Added on Nov 15 2010