Hi All,
I'm looking at a vdbench failure that says we fail at start of workload on first read for BAD_READ_RETURN: 'Read was successful, but data buffer contents not changed' with a before/after pattern of "Front: dea96590 End: deadeeee" --> This seems to indicate a data validation problem, but let's not be so quick to claim this. Workload starts and we first get a 797 error reported by vdbench.
Sys:
18.04LTS Kernel 5.3
vdbench 50403
java 1.8
Workload (script6.txt):
data_errors=1
sd=sd1,lun=/dev/nvme6n1,align=4k
wd=wd1,sd=sd1,rdpct=100,seekpct=-1,xfersize=64k,openflags=o_direct
rd=run1,wd=wd1,iorate=max,threads=64,elapsed=115200,interval=60
vdbench output:
17:43:04.001 Starting RD=run1; I/O rate: Uncontrolled MAX; elapsed=115200; For loops: threads=64
17:43:04.333 17:43:04.333 localhost-0: 17:43:04.332 op: read lun: /dev/nvme6n1 lba: 131072 0x00020000 xfer: 65536 errno: BAD_READ_RETURN: 'Read was successful, but data buffer contents not changed'
17:43:04.836 killCommand: /vdbench/vdbench SlaveJvm -m localhost -n localhost-10-200519-17.43.03.086 -l localhost-0 -p 5582
17:43:04.836
17:43:04.836 'data_errors=1' requested. Abort rd=run1 after last error.
17:43:04.836
17:43:04.836 removeKillAtEnd(): Unable to find /vdbench/vdbench SlaveJvm -m localhost -n localhost-10-200519-17.43.03.086 -l localhost-0 -p 5582
java.lang.RuntimeException: 'data_errors=1' requested. Abort rd=run1 after last error.
at Vdb.common.failure(common.java:350)
at Vdb.common.failure(common.java:297)
at Vdb.ErrorLog.countErrorsOnMaster(ErrorLog.java:224)
at Vdb.SlaveOnMaster.processSlave(SlaveOnMaster.java:207)
at Vdb.SlaveOnMaster.run(SlaveOnMaster.java:42)
Std Out:
17:43:03.350 17:43:03.349 SlaveJvm execution parameter: '-m localhost'
17:43:03.350 17:43:03.349 SlaveJvm execution parameter: '-n localhost-10-200519-17.43.03.086'
17:43:03.350 17:43:03.350 SlaveJvm execution parameter: '-l localhost-0'
17:43:03.350 17:43:03.350 SlaveJvm execution parameter: '-p 5582'
17:43:03.350 17:43:03.350 SlaveJvm positional parameter: 'SlaveJvm'
17:43:03.448 17:43:03.447 successfully connected to master localhost
17:43:03.448 17:43:03.447 Connection to localhost using port 5582 successful
17:43:03.536 17:43:03.535 ticks_per_second: 100
17:43:03.547 17:43:03.547 Configuration interpretation took 0.01 seconds
17:43:03.679 17:43:03.679 Beginning of run setup
17:43:03.680 17:43:03.679 File '/var/log/messages' does not exist. Monitoring bypassed
17:43:03.680 17:43:03.679 **********************
17:43:03.680
17:43:03.680
17:43:03.680 17:43:03.679 File '/var/log/messages.1' does not exist. Monitoring bypassed
17:43:03.758 17:43:03.758 Opening sd=sd1,lun=/dev/nvme6n1; write: false; flags: OpenFlags: 0x00004000 OtherFlags: 0x00000000
17:43:03.770 17:43:03.770 Started 1 Workload Generator threads.
17:43:03.834 17:43:03.834 Started 64 i/o threads for sd1
17:43:03.834 17:43:03.834 Started a total of 64 i/o threads.
17:43:03.834 17:43:03.834 Waiting for task synchronization
17:43:03.835 17:43:03.834 Requesting full garbage collection
17:43:03.835 17:43:03.835 GcTracker: cum: 1 intv: 1 ms: 7 mss: 0.00% Heap_MB max: 911 curr: 252 used: 155 free: 96
17:43:03.855 17:43:03.855 GcTracker: cum: 3 intv: 2 ms: 18 mss: 0.00% Heap_MB max: 911 curr: 284 used: 17 free: 267
17:43:03.855 17:43:03.855 task_wait_start_complete() end
17:43:04.001 Starting RD=run1; I/O rate: Uncontrolled MAX; elapsed=115200; For loops: threads=64
17:43:04.002 17:43:04.002 task_run_all(): 65 tasks
17:43:04.329 17:43:04.328 Pre-formatted read buffer contents still found after successful read. Returning error 797
17:43:04.329 17:43:04.328 Negative response time. Usually caused by out of sync cpu timers.
17:43:04.329 17:43:04.329 Failure 2, buffer: 0x7fb784001000
17:43:04.329 17:43:04.329 Negative response time. Usually caused by out of sync cpu timers.
17:43:04.329 17:43:04.329 Will report a maximum of 100 times after which Vdbench will continue.
17:43:04.329 17:43:04.329 Will report a maximum of 100 times after which Vdbench will continue.
17:43:04.329 17:43:04.329 Front: 00110020 dea96590
17:43:04.329 17:43:04.329 Response time (microseconds): -1589928011188412
17:43:04.329 17:43:04.329 Response time (microseconds): -1589928011188663
17:43:04.329 17:43:04.329 End: 84001000 deadeeee
17:43:04.333 17:43:04.333 17:43:04.332 op: read lun: /dev/nvme6n1 lba: 131072 0x00020000 xfer: 65536 errno: BAD_READ_RETURN: 'Read was successful, but data buffer contents not changed'
17:43:04.436 17:43:04.436 Sent 64 interrupts to 64 waiting IO_task threads
17:43:04.437 17:43:04.436 Java Heap in MB. max: 910.500; current: 284.000; used: 28.442; free: 255.558
17:43:04.437 17:43:04.437 Maximum native memory allocation: 4,194,304; Current allocation: 0
17:43:04.437 17:43:04.437 End of run
17:43:04.437 17:43:04.437 **********
What is 797?
According to @"Henk Vandenbergh-Oracle" (correct me if I'm wrong): "The 797 check is actually not a Data Validation check, it is done for every single read operation, with or without DV being active... Reporting 'read' versus 'expected' data caused serious confusion, letting users think that "we read the wrong data". We did not read the wrong data, we did not read at all... THAT by the way is the main reason I introduced the 797 check, and the fact that the 797 also was done and worked without DV being active was an accidental and very lucky side effect."
Question:
So if the drive failed bc the data was never read, then this seems to point to a system issue, but what exactly? Should focus be on better JVM tuning or workload optimization? What is "Negative response time. Usually caused by out of sync cpu timers?"
Anything else I should be looking at? I looked at kernel log and syslog, but nothing stood out (let me know and I can post).
Thanks!!!
-Brad