Hi,
As per the subject, the master node in a 3-node BDB JE HA group failed with the following errors:
2019-01-14 19:07:19,296 ERROR [Feeder Output for node_3] (c.s.j.r.i.RepImpl) - [node_2] Halted log file reading at file 0x2c51 offset 0x34b793 offset(decimal)=3454867 prev=0x34b4f3:
entry=INS_LN_TXtype=33,version=8)
prev=0x34b4f3
size=3883
Next entry should be at 0x34c6d4
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 UNEXPECTED_STATE: Unexpected internal state, may have side effects.
at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
at com.sleepycat.je.rep.stream.FeederReader.checkForPassingTarget(FeederReader.java:288)
at com.sleepycat.je.rep.stream.FeederReader.isTargetEntry(FeederReader.java:309)
at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:297)
at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:229)
at com.sleepycat.je.rep.stream.FeederReader.scanForwards(FeederReader.java:272)
at com.sleepycat.je.rep.stream.MasterFeederSource.getWireRecord(MasterFeederSource.java:64)
at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.run(Feeder.java:761)
2019-01-14 19:07:19,298 ERROR [Feeder Output for node_3] (c.s.j.r.i.n.Feeder) - [node_2] Unexpected exception: (JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 UNEXPECTED_STATE: Unexpected internal state, may have side effects. MasterFeederSource fetching vlsn=151,849,824 waitTime=1000
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 UNEXPECTED_STATE: Unexpected internal state, may have side effects. MasterFeederSource fetching vlsn=151,849,824 waitTime=1000
at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
at com.sleepycat.je.rep.stream.FeederReader.checkForPassingTarget(FeederReader.java:288)
at com.sleepycat.je.rep.stream.FeederReader.isTargetEntry(FeederReader.java:309)
at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:297)
at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:229)
at com.sleepycat.je.rep.stream.FeederReader.scanForwards(FeederReader.java:272)
at com.sleepycat.je.rep.stream.MasterFeederSource.getWireRecord(MasterFeederSource.java:64)
at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.run(Feeder.java:761)
2019-01-14 19:07:19,299 ERROR [Feeder Output for node_3] (c.s.j.r.i.n.Feeder) - [node_2] Uncaught exception in feeder thread Thread[Feeder Output for node_3,5,main](JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 UNEXPECTED_STATE: Unexpected internal state, may have side effects. MasterFeederSource fetching vlsn=151,849,824 waitTime=1000
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 UNEXPECTED_STATE: Unexpected internal state, may have side effects. MasterFeederSource fetching vlsn=151,849,824 waitTime=1000
at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
at com.sleepycat.je.rep.stream.FeederReader.checkForPassingTarget(FeederReader.java:288)
at com.sleepycat.je.rep.stream.FeederReader.isTargetEntry(FeederReader.java:309)
at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:297)
at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:229)
at com.sleepycat.je.rep.stream.FeederReader.scanForwards(FeederReader.java:272)
at com.sleepycat.je.rep.stream.MasterFeederSource.getWireRecord(MasterFeederSource.java:64)
at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.run(Feeder.java:761)
It looks like the master node "node_2" failed to read data from disk in order to send it to "node_3".
The master node detached after that. The affected environment was restarted without any issues and was able to operate as a Replica.
As per the logs, the BDB JE version in use is 5.0.104.
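For reference, the two positions reported in the message differ by 5. Below is a minimal sketch of how that gap can be pulled out of the log line. It assumes both numbers are VLSNs (the "fetching vlsn=" suffix in the log suggests so); the class and method names are hypothetical helpers of mine, not part of the JE API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JE: extracts the two positions from the
// "want to read X but reader at Y" message and returns their difference.
class VlsnGap {
    // Numbers in the JE message use comma grouping, e.g. 151,849,824
    private static final Pattern PAIR =
        Pattern.compile("want to read ([\\d,]+) but reader at ([\\d,]+)");

    /** Returns (reader position - wanted position) for the given log line. */
    static long gap(String logLine) {
        Matcher m = PAIR.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no 'want to read ... but reader at ...' pair found");
        }
        long wanted = Long.parseLong(m.group(1).replace(",", ""));
        long reader = Long.parseLong(m.group(2).replace(",", ""));
        return reader - wanted;
    }

    public static void main(String[] args) {
        String line = "(JE 5.0.104) want to read 151,849,824 but reader at 151,849,829 "
            + "UNEXPECTED_STATE: Unexpected internal state, may have side effects.";
        System.out.println(gap(line)); // prints 5
    }
}
```

If the assumption holds, the reader was already 5 entries past the VLSN the feeder was asked to fetch for node_3.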
What could be the root cause of the unexpected read position (want to read 151,849,824 but reader at 151,849,829)? Is it a known defect or some sort of environmental issue?
If it is a defect, would upgrading to the latest 7.x version solve the problem?
Kind Regards,
Alex