I'm on a team developing a custom file system that works with NFS. We've been putting it through the wringer with Vdbench to iron out any kinks, and we've now hit a data inconsistency that only shows up under one specific condition. Vdbench, for some reason, missed it completely.
Here's the deal: writes are fine, no issues there. But when we read data that's not aligned to 512 bytes (for example, dd with bs=1 and iflag=direct), we get 0xff bytes back instead of the file contents. This isn't a one-off; it's trivial to reproduce. A single dd command lays it bare (I'll drop the command and its output below). Despite this, Vdbench didn't flag anything.
[root@env6-client fs2]# echo '012345' > nums.txt; dd if=nums.txt iflag=direct status=none | xxd
0000000: 3031 3233 3435 0a 012345.
[root@env6-client fs2]# echo '012345' > nums.txt; dd if=nums.txt iflag=direct status=none bs=1 | xxd
0000000: ffff ffff ffff ff .......
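To sweep more read sizes than the two dd runs above, here's a minimal sketch of how the check could be scripted. The function name check_read_sizes and the IFLAG variable are made up for this post; it only relies on dd and cmp. It compares each read against a reference copy kept on a file system we trust, rather than against another read from the mount under test.

```shell
# check_read_sizes REF TARGET BS...
# Read TARGET once per block size with dd and compare the bytes against REF,
# a known-good copy stored somewhere else (e.g. a local disk).
# IFLAG is a hypothetical knob for this sketch: set IFLAG=direct to bypass the
# client page cache on the mount under test, or leave it unset elsewhere.
check_read_sizes() {
    ref=$1
    target=$2
    shift 2
    rc=0
    for bs in "$@"; do
        # cmp -s is silent and exits non-zero on any difference, including
        # a short or empty read coming out of dd.
        if dd if="$target" bs="$bs" ${IFLAG:+iflag="$IFLAG"} status=none | cmp -s - "$ref"; then
            echo "bs=$bs OK"
        else
            echo "bs=$bs MISMATCH"
            rc=1
        fi
    done
    return "$rc"
}
```

Assuming a reference file /tmp/ref.bin that has also been copied to /mnt/fs2/ref.bin, a run would look like `IFLAG=direct check_read_sizes /tmp/ref.bin /mnt/fs2/ref.bin 1 7 511 512 513 4096`. Going by the dd output above, bs=1 is the case I'd expect to come back as MISMATCH on our file system.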
And here's our Vdbench config, the command, and the full output (the output is short because the run only lasts 10 seconds):
[root@env6-client fs2]# cat /home/vdbench/vdbench.conf
data_errors=1
messagescan=no
validate=(yes,read_after_write,no_preread)
hd=default,vdbench=/home/vdbench,user=root,shell=ssh
hd=hd1,system=localhost
fsd=fsd1,anchor=/mnt/fs2,depth=1,width=1,files=1,size=1522756,openflag=o_direct
fwd=format,threads=1
fwd=default,xfersize=(2,5,510,8,512,3,514,8,4094,8,4096,3,4098,8,8190,8,8192,3,8194,8,16382,8,16384,3,16386,8,524286,8,524288,3,524290,8),fileio=random,fileselect=random,rdpct=0,threads=1
fwd=fwd1,fsd=fsd1,host=hd1
rd=rd1,fwd=fwd*,fwdrate=max,format=no,elapsed=8,warmup=2,interval=1
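For context on that xfersize line: as I understand the Vdbench syntax, the list is a distribution of (size, percent) pairs. Here's a quick sketch that tallies how much of the requested mix is not a multiple of 512 bytes. The helper name unaligned_share is made up for this post; it is not part of Vdbench.

```shell
# unaligned_share LIST [ALIGN]
# LIST is a "size,pct,size,pct,..." string as used in xfersize=(...).
# Prints the summed percentage of sizes that are not a multiple of ALIGN
# (default 512). Hypothetical helper, assuming the pair interpretation above.
unaligned_share() {
    list=$1
    align=${2:-512}
    total=0
    unaligned=0
    # Split the list on commas into the positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $list
    IFS=$old_ifs
    while [ "$#" -ge 2 ]; do
        size=$1
        pct=$2
        shift 2
        total=$((total + pct))
        if [ $((size % align)) -ne 0 ]; then
            unaligned=$((unaligned + pct))
        fi
    done
    echo "unaligned=${unaligned}% of ${total}%"
}

# The list from our config:
unaligned_share "2,5,510,8,512,3,514,8,4094,8,4096,3,4098,8,8190,8,8192,3,8194,8,16382,8,16384,3,16386,8,524286,8,524288,3,524290,8"
# prints: unaligned=85% of 100%
```

So most of the transfer sizes we ask for are deliberately off the 512-byte boundary, which is why the clean result from the run puzzles us.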
[root@env6-client fs2]# /home/vdbench/vdbench -vr -f /home/vdbench/vdbench.conf
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50406 Wed July 20 15:49:52 MDT 2016
For documentation, see 'vdbench.pdf'.
10:47:54.829 Created output directory '/mnt/fs2/output'
10:47:54.903 input argument scanned: '-vr'
10:47:54.904 input argument scanned: '-f/home/vdbench/vdbench.conf'
10:47:54.999 Anchor size: anchor=/mnt/fs2: dirs: 1; files: 1; bytes: 1.452m (1,522,756)
10:47:56.086 Starting slave: /home/vdbench/vdbench SlaveJvm -m 192.168.220.233 -n localhost-10-240229-10.47.54.804 -l hd1-0 -p 5570
10:47:56.432 All slaves are now connected
10:47:58.001 Starting RD=rd1; elapsed=8 warmup=2; fwdrate=max. For loops: None
Feb 29, 2024 .Interval. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...mkdir.... ...rmdir.... ...create... ....open.... ...close.... ...delete...
                          rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp
10:47:59.056          1   22.0 23.210   3.9 0.59  50.0   11.0 17.357   11.0 29.063  0.57  0.57   1.13   54086   0.0  0.000   0.0  0.000   0.0  0.000   1.0 11.130   0.0  0.000   0.0  0.000
10:48:00.025          2   19.0 35.517   4.8 0.64  47.4    9.0 28.992   10.0 41.389  1.07  1.57   2.63  145354   0.0  0.000   0.0  0.000   0.0  0.000   1.0 10.270   1.0  0.115   0.0  0.000
10:48:01.019          3   25.0 34.612   3.4 0.63  52.0   13.0 31.381   12.0 38.112  2.05  1.55   3.59  150732   0.0  0.000   0.0  0.000   0.0  0.000   1.0  4.519   1.0  0.039   0.0  0.000
10:48:02.056          4   23.0 40.033   2.7 0.56  47.8   11.0 30.020   12.0 49.212  1.07  1.07   2.14   97770   0.0  0.000   0.0  0.000   0.0  0.000   1.0  3.411   1.0  0.038   0.0  0.000
10:48:03.014          5   33.0 26.871   3.4 0.88  51.5   17.0 20.892   16.0 33.224  1.07  1.07   2.14   68111   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000
10:48:04.011          6   34.0 27.019   3.0 0.88  50.0   17.0 20.243   17.0 33.796  0.61  0.61   1.22   37497   0.0  0.000   0.0  0.000   0.0  0.000   1.0  4.134   1.0  0.042   0.0  0.000
10:48:05.053          7   44.0 21.679   2.6 1.00  50.0   22.0 15.658   22.0 27.700  0.62  0.62   1.24   29626   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000
10:48:06.013          8   19.0 41.997   2.9 0.56  47.4    9.0 32.945   10.0 50.144  1.54  2.04   3.59  197954   0.0  0.000   0.0  0.000   0.0  0.000   1.0  3.429   1.0  0.026   0.0  0.000
10:48:07.012          9   22.0 36.739   2.8 0.50  50.0   11.0 32.683   11.0 40.796  2.05  2.05   4.11  195862   0.0  0.000   0.0  0.000   0.0  0.000   1.0  3.459   1.0  0.030   0.0  0.000
10:48:08.053         10   25.0 34.817   2.6 0.69  52.0   13.0 30.603   12.0 39.381  1.57  1.07   2.63  110468   0.0  0.000   0.0  0.000   0.0  0.000   1.0  5.214   1.0  0.037   0.0  0.000
10:48:08.058   avg_3-10   27.1 31.544   3.1 0.70  50.0   13.6 25.383   13.6 37.705  1.29  1.29   2.59  100150   0.0  0.000   0.0  0.000   0.0  0.000   0.8  4.919   0.8  0.047   0.0  0.000
10:48:08.195 hd1-0: 10:48:08.193 Total amount of key blocks read and validated: 6,415,366; key blocks marked in error: 0
10:48:08.383
10:48:08.383 Miscellaneous statistics:
10:48:08.383 (These statistics do not include activity between the last reported interval and shutdown.)
10:48:08.383 WRITE_OPENS Files opened for write activity: 8 0/sec
10:48:08.384 FILE_CLOSES Close requests: 7 0/sec
10:48:08.384
10:48:09.198 Vdbench execution completed successfully. Output directory: /mnt/fs2/output
Has anyone else wrestled with something like this? Any insights on why Vdbench might've missed it or tips on tweaking our setup to catch these issues? If there's a better tool for this or some Vdbench ninja moves we're missing, I'm all ears.
Really appreciate any thoughts or advice you've got. Thanks a ton!