Understanding LockTimeOutExceptions better

vinothchandar, Oct 2 2012 (edited Oct 9 2012)
Hi,

Background: We use BDB as the storage backend for Voldemort. We run on SSDs and use a shared cache for multiple environments (up to 25 on a single server). Our writes go through a txn with an RMW lock. We use duplicates, and hence use a transactional cursor to perform a get(), delete(), put() cycle. (I am working on making the duplicates go away, though, after which we will use a simple get/put in a txn.) Reads use UNCOMMITTED isolation. We don't have secondary indexes or anything.

What I am seeing is that there are sometimes heavy lock timeouts, which drive latency up very high. Our lock timeout is 500 ms. Considering we are on SSDs, I would expect any operation to finish much quicker than 500 ms. For example, the following instance shows 10 waiters. Assuming a 4-level tree and at most 1 ms access time for each node fetch (data / index), all 10 waiters should have been granted access in under 40 ms (just guessing).

[BdbStorageEngine] [voldemort-niosocket-server34] [voldemort] com.sleepycat.je.LockTimeoutException: (JE 4.0.92) Lock expired. Locker 1317118982 164639096_voldemort-niosocket-server34_Txn: waited for lock on database=message_sent_history LockAddr:265706381 node=170810678 type=WRITE grant=WAIT_NEW timeoutMillis=500 startTime=1348704310016 endTime=1348704310535
Owners: [<LockInfo locker="606847962 164639091_voldemort-niosocket-server44_Txn" type="WRITE"/>]
Waiters: [<LockInfo locker="893857731 164639092_voldemort-niosocket-server28_Txn" type="WRITE"/>, <LockInfo locker="1826240023 164639093_voldemort-niosocket-server1_Txn" type="WRITE"/>, <LockInfo locker="2142263792 164639094_voldemort-niosocket-server17_Txn" type="WRITE"/>, <LockInfo locker="1710362032 164639095_voldemort-niosocket-server33_Txn" type="WRITE"/>, <LockInfo locker="1317822219 164639098_voldemort-niosocket-server10_Txn" type="WRITE"/>, <LockInfo locker="153266016 164639099_voldemort-niosocket-server35_Txn" type="WRITE"/>, <LockInfo locker="861632969 164639101_voldemort-niosocket-server11_Txn" type="WRITE"/>, <LockInfo locker="1385676009 164639103_voldemort-niosocket-server31_Txn" type="WRITE"/>, <LockInfo locker="1744015195 164639104_voldemort-niosocket-server19_Txn" type="WRITE"/>, <LockInfo locker="1933099921 164639107_voldemort-niosocket-server32_Txn" type="WRITE"/>]
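To read a message like the one above more easily, the wait time and the number of lockers can be pulled out with a few regular expressions. This is only a sketch: the class and method names are made up for illustration, and it assumes the message keeps the `name=digits` and `<LockInfo .../>` layout shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a JE LockTimeoutException message.
final class LockTimeoutLog {

    // Returns the numeric value of a "name=digits" field, e.g. "startTime=1348704310016".
    static long field(String message, String name) {
        Matcher m = Pattern.compile(Pattern.quote(name) + "=(\\d+)").matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("missing field: " + name);
        }
        return Long.parseLong(m.group(1));
    }

    // Time the locker actually waited before the lock request expired.
    static long waitedMillis(String message) {
        return field(message, "endTime") - field(message, "startTime");
    }

    // Counts <LockInfo .../> entries on an "Owners:" or "Waiters:" line.
    static int countLockers(String line) {
        Matcher m = Pattern.compile("<LockInfo ").matcher(line);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String msg = "type=WRITE grant=WAIT_NEW timeoutMillis=500 "
                + "startTime=1348704310016 endTime=1348704310535";
        // prints "waited 519 ms (timeout 500 ms)"
        System.out.println("waited " + waitedMillis(msg)
                + " ms (timeout " + field(msg, "timeoutMillis") + " ms)");
    }
}
```

For the message above, the locker waited 519 ms against a 500 ms timeout, behind 1 owner and 10 other waiters.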

Or in other words, even a lock on a node higher up the tree should be resolved very soon, right?
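The 40 ms figure comes from a simple serial model, written out here as a sketch. It assumes the waiters are served one after another and that each one holds the lock only for as long as its node fetches take; both are assumptions of the estimate, not measured behaviour, and the names are made up for illustration.

```java
// Back-of-envelope estimate only; not a model of how JE actually schedules locks.
final class LockWaitEstimate {

    // Time for the last of `lockHolders` to be served if each one performs
    // `treeLevels` node fetches of `millisPerNodeFetch` each, strictly in sequence.
    static long worstCaseWaitMillis(int treeLevels, long millisPerNodeFetch, int lockHolders) {
        long millisPerOperation = treeLevels * millisPerNodeFetch;
        return millisPerOperation * lockHolders;
    }

    public static void main(String[] args) {
        // 4-level tree, 1 ms per node fetch, 10 waiters
        System.out.println(worstCaseWaitMillis(4, 1, 10) + " ms"); // prints "40 ms"
    }
}
```

Under these assumptions the expected worst case is more than ten times below the 500 ms timeout, which is why the timeouts look surprising.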

I know lowering the lock timeout could help, but I would like to understand why we would get LockTimeoutExceptions in the first place.

Thanks
Vinoth
Post Details
Locked on Nov 6 2012
Added on Oct 2 2012
26 comments
3,303 views