Berkeley DB file corrupted while operating for hours -- PANIC

Posted by 715974 on Aug 6 2009 (edited Aug 25 2009)

Hello gurus,

I've been struggling with Berkeley DB corruption for days. Can somebody give me a hand?

I'm developing a product that operates on thousands of Berkeley DB files on Ubuntu Linux. (I raised the per-process limit on open files.)
While running, the product needs high-performance put/get/lookup over these files, and Berkeley DB is the best choice I have found so far.
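
For reference, the per-process open-file limit on Linux is usually raised with getrlimit/setrlimit on RLIMIT_NOFILE. The following is only a minimal sketch of that technique, not the actual code from the product; the function name raise_nofile_limit is hypothetical.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft RLIMIT_NOFILE towards `wanted`,
 * capped at the hard limit. Returns the soft limit in effect afterwards,
 * or 0 if getrlimit/setrlimit failed.
 */
rlim_t raise_nofile_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 0;
    }

    /* An unprivileged process cannot raise the soft limit above the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    /* Only ever raise the limit; leave it alone if it is already high enough. */
    if (wanted > rl.rlim_cur) {
        rl.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("setrlimit");
            return 0;
        }
    }

    return rl.rlim_cur;
}
```

Something like raise_nofile_limit(8192) would be called early in main(), before any DB file is opened. Going beyond the hard limit needs privileges or a system-level change, so the return value should be checked against the number of files the program expects to keep open.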

The problem is that after running for a couple of hours, some of the BDB files become corrupted. (I don't close a DB file until the program exits.)
My program is multi-threaded, but I allocate a lock for each DB handle. I've tried both versions 4.6.21 and 4.7.25.
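
To make "a lock for each DB handle" concrete, here is a minimal sketch of the pattern, assuming POSIX threads. The names guarded_handle, guarded_init, guarded_call and bump_counter are hypothetical, and the void * handle is only a stand-in for a Berkeley DB handle; this is an illustration, not the code from the product.

```c
#include <pthread.h>

/* Hypothetical wrapper: one mutex per handle. */
typedef struct {
    pthread_mutex_t lock;   /* serializes all access to `handle` */
    void *handle;           /* stand-in for a DB handle */
    unsigned long ops;      /* number of operations run under the lock */
} guarded_handle;

/* An operation to run on the handle while the lock is held. */
typedef int (*handle_op)(void *handle, void *arg);

/* Initialize the wrapper; returns 0 on success (pthread_mutex_init's result). */
int guarded_init(guarded_handle *g, void *handle)
{
    g->handle = handle;
    g->ops = 0;
    return pthread_mutex_init(&g->lock, NULL);
}

/* Run `op` on the handle with the per-handle lock held; returns op's result. */
int guarded_call(guarded_handle *g, handle_op op, void *arg)
{
    int ret;

    pthread_mutex_lock(&g->lock);
    ret = op(g->handle, arg);
    g->ops++;
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* Example operation for illustration: treats the handle as an int counter. */
int bump_counter(void *handle, void *arg)
{
    (void)arg;
    ++*(int *)handle;
    return 0;
}
```

With this pattern, every put/get on a given handle goes through guarded_call, so at most one thread touches that handle at a time. It only protects calls that actually go through the wrapper, and it does not coordinate anything shared between different handles.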

These are the error messages I get most often:

DB_PAGE_NOTFOUND: Requested page not found
page ####: illegal page type or format
PANIC: Invalid argument
DB_RUNRECOVERY: Fatal error, run database recovery
PANIC: fatal region error detected; run recovery

I also caught the following with valgrind; I'm not sure whether it is related:
==21834== Thread 11:
==21834== Syscall param pwrite64(buf) points to uninitialised byte(s)
==21834== at 0x4E39A68: (within /lib/libpthread-2.8.90.so)
==21834== by 0x516D755: __os_io (in /usr/lib/libdb-4.7.so)
==21834== by 0x515B9F1: (within /usr/lib/libdb-4.7.so)
==21834== by 0x515BC4D: __memp_bhwrite (in /usr/lib/libdb-4.7.so)
==21834== by 0x515A7BA: __memp_alloc (in /usr/lib/libdb-4.7.so)
==21834== by 0x515C66D: __memp_fget (in /usr/lib/libdb-4.7.so)
==21834== by 0x5127B70: __db_goff (in /usr/lib/libdb-4.7.so)
==21834== by 0x51329C5: __db_ret (in /usr/lib/libdb-4.7.so)
==21834== by 0x51153B7: __dbc_get (in /usr/lib/libdb-4.7.so)
==21834== by 0x5120EF4: __db_get (in /usr/lib/libdb-4.7.so)
==21834== by 0x51211FA: __db_get_pp (in /usr/lib/libdb-4.7.so)

==21834== Address 0x1cf0f008 is 144 bytes inside a block of size 8,272 alloc'd
==21834== at 0x4C265AE: malloc (vg_replace_malloc.c:207)
==21834== by 0x516AD27: __os_malloc (in /usr/lib/libdb-4.7.so)
==21834== by 0x513B07E: __env_alloc (in /usr/lib/libdb-4.7.so)
==21834== by 0x515A32F: __memp_alloc (in /usr/lib/libdb-4.7.so)
==21834== by 0x515C66D: __memp_fget (in /usr/lib/libdb-4.7.so)
==21834== by 0x512414F: __db_new (in /usr/lib/libdb-4.7.so)
==21834== by 0x51278E9: __db_poff (in /usr/lib/libdb-4.7.so)
==21834== by 0x50A48D5: __ham_add_el (in /usr/lib/libdb-4.7.so)
==21834== by 0x509437C: (within /usr/lib/libdb-4.7.so)
==21834== by 0x5117C26: __dbc_put (in /usr/lib/libdb-4.7.so)
==21834== by 0x5109940: __db_put (in /usr/lib/libdb-4.7.so)
==21834== by 0x511F274: __db_put_pp (in /usr/lib/libdb-4.7.so)

Verifying one of the BDB files reports:
db4.7_verify: Page 1: offpage item 1 has bad pgno 1188
db4.7_verify: Page 2: offpage item 1 has bad pgno 901
db4.7_verify: /STORAGE2/0026.9.db: DB_VERIFY_BAD: Database verification failed

My product relies so heavily on Berkeley DB that this is fatal. Please point out how I can prevent this PANIC hell.

Thank you very much,

- Hanphy

Edited by: user11766808 on Aug 6, 2009 4:16 AM

Edited by: user11766808 on Aug 6, 2009 4:25 AM

Edited by: user11766808 on Aug 6, 2009 4:28 AM
Locked on Sep 22 2009
Added on Aug 6 2009