Database recovery takes a very long time - should I be worried?
827738 Dec 31 2010, edited Jan 4 2011
I have an application that uses high availability. I had to kill one of the nodes. Now that I have restarted the node, it is performing a recovery as expected. However, the recovery has been running for over an hour now. The database is almost exactly 1 GB in size. A stack trace shows me that the recovery is running, but it seems to be spending an awful lot of time in listFiles:
"main" prio=3 tid=0x08071800 nid=0x2 runnable [0xfe39c000]
   java.lang.Thread.State: RUNNABLE
    at java.io.UnixFileSystem.list(Native Method)
    at java.io.File.list(File.java:973)
    at java.io.File.list(File.java:1004)
    at com.sleepycat.je.log.FileManager.listFiles(FileManager.java:636)
    at com.sleepycat.je.log.FileManager.getFollowingFileNum(FileManager.java:555)
    at com.sleepycat.je.log.FileReader$ReadWindow.fillNext(FileReader.java:1081)
    at com.sleepycat.je.log.FileReader.readData(FileReader.java:758)
    at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:258)
    at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:230)
    at com.sleepycat.je.recovery.RecoveryManager.undoLNs(RecoveryManager.java:1051)
    at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:479)
    at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:174)
...
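In case it helps with diagnosis, here is a minimal sketch for checking whether the directory listing itself is expensive on this machine. It counts the .jdb log files in the environment home and times repeated File.list() calls. The class name ListTiming, its method names, and the command-line argument for the environment directory are hypothetical and not part of JE; the sketch only uses java.io.File, and it does not reproduce what the recovery code does internally.

```java
import java.io.File;

// Hypothetical helper, not part of JE: measures how costly a plain
// directory listing is for a given directory.
class ListTiming {

    // Counts entries whose names end in ".jdb" (the JE log file suffix).
    // Returns -1 if the directory cannot be listed.
    static int countLogFiles(File dir) {
        String[] names = dir.list((d, name) -> name.endsWith(".jdb"));
        return names == null ? -1 : names.length;
    }

    // Average wall-clock nanoseconds per unfiltered File.list() call.
    static long averageListNanos(File dir, int repetitions) {
        if (repetitions <= 0) {
            throw new IllegalArgumentException("repetitions must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            dir.list();
        }
        return (System.nanoTime() - start) / repetitions;
    }

    public static void main(String[] args) {
        // Assumption: the first argument is the environment home directory;
        // defaults to the current directory when omitted.
        File envHome = new File(args.length > 0 ? args[0] : ".");
        System.out.println("log files (.jdb): " + countLogFiles(envHome));
        System.out.println("avg ns per list(): " + averageListNanos(envHome, 100));
    }
}
```

Running it against the environment directory of the recovering node would show roughly how many log files each listing has to enumerate and how long one listing takes, which might indicate whether the time in listFiles comes from slow listings or from the number of calls.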
I am using JE version 4.0.103. The process has plenty of heap space available and is consistently consuming about 6% CPU. The other two nodes are still busy with reads and writes. Should I be worried or just patient?
Any advice much appreciated.
--
James