Solaris 11.4.42.111.0 - slow ZFS scrub performance

Bones, Mar 8 2022

Hi,
While scrubbing a ZFS pool named storage, the scrub speed drops dramatically and the estimated time left increases to days. Using zpool monitor scrub, I get the following performance output:
TIMESTMP POOL PROVIDER PCTDONE SPEED TIMELEFT TOTAL
14:45:04 storage scrub 80.7 1.41M 11d09h 6.85T
14:50:04 storage scrub 80.7 1.44M 11d03h 6.85T
14:55:04 storage scrub 80.7 1.41M 11d09h 6.85T
15:00:04 storage scrub 80.7 1.38M 11d14h 6.85T
15:05:04 storage scrub 80.7 1.34M 11d22h 6.85T
15:10:04 storage scrub 80.7 1.3M 12d07h 6.85T
15:15:04 storage scrub 80.7 1.27M 12d16h 6.85T
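As a sanity check on the numbers above, TIMELEFT looks like it is simply the remaining data divided by the current SPEED. The sketch below recomputes it for the first row; it assumes binary units (M = 2^20 bytes, T = 2^40 bytes), and the helper names are made up for illustration, they are not part of any ZFS tool.

```python
# Assumption: sizes use binary units, as is usual for ZFS tools.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def parse_size(text):
    """Convert a size string such as '1.41M' or '6.85T' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]

def time_left_seconds(pct_done, speed, total):
    """Remaining bytes divided by the current scrub speed in bytes/s."""
    remaining = parse_size(total) * (1 - pct_done / 100)
    return remaining / parse_size(speed)

def format_days_hours(seconds):
    """Format a duration as e.g. '11d09h', rounded to whole hours."""
    hours = round(seconds / 3600)
    return f"{hours // 24}d{hours % 24:02d}h"

# First row of the zpool monitor output: 80.7% done, 1.41M/s, 6.85T total
secs = time_left_seconds(80.7, "1.41M", "6.85T")
print(format_days_hours(secs), f"({secs / 86400:.2f} days)")
```

For the first row this lands on roughly 11.4 days, in line with the reported 11d09h. Other rows can differ by an hour or so because PCTDONE and SPEED are rounded in the output.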
iostat performance at the same time is:
extended device statistics ---- errors ---
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.0 23.5 0.0 147.7 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c1t0d0
66.7 7.6 146.6 23.5 0.0 0.4 0.0 5.4 0 28 0 0 0 0 c1t1d0
65.9 12.3 177.9 46.2 0.0 0.4 0.0 4.8 0 26 0 0 0 0 c1t2d0
62.0 7.8 145.0 23.5 0.0 0.4 0.0 5.4 0 25 0 0 0 0 c1t3d0
68.3 12.3 180.1 46.2 0.0 0.4 0.0 5.2 0 28 0 0 0 0 c1t4d0
62.5 7.9 143.1 23.5 0.0 0.4 0.0 5.6 0 26 0 0 0 0 c1t5d0
65.9 12.2 184.0 46.2 0.0 0.4 0.0 5.2 0 27 0 0 0 0 c1t6d0
extended device statistics ---- errors ---
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.0 18.7 0.0 68.9 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c1t0d0
50.5 11.5 89.3 82.5 0.0 0.3 0.0 5.5 0 22 0 0 0 0 c1t1d0
50.7 17.5 90.6 164.2 0.0 0.4 0.0 5.6 0 23 0 0 0 0 c1t2d0
51.6 10.9 93.8 82.5 0.0 0.3 0.0 5.5 0 22 0 0 0 0 c1t3d0
53.1 17.0 84.0 164.2 0.0 0.4 0.0 5.7 0 26 0 0 0 0 c1t4d0
53.6 11.7 91.5 82.5 0.0 0.4 0.0 6.1 0 26 0 0 0 0 c1t5d0
52.6 16.6 81.8 164.2 0.0 0.4 0.0 5.9 0 27 0 0 0 0 c1t6d0
extended device statistics ---- errors ---
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.0 23.4 0.0 145.3 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c1t0d0
30.6 8.0 50.3 24.1 0.0 0.3 0.0 6.7 0 21 0 0 0 0 c1t1d0
31.8 12.5 48.8 47.4 0.0 0.3 0.0 6.7 0 22 0 0 0 0 c1t2d0
29.9 8.3 49.3 24.1 0.0 0.3 0.0 6.8 0 21 0 0 0 0 c1t3d0
31.6 12.4 47.6 47.4 0.0 0.3 0.0 6.4 0 22 0 0 0 0 c1t4d0
26.4 8.0 45.4 24.1 0.0 0.3 0.0 7.4 0 18 0 0 0 0 c1t5d0
31.8 12.3 49.3 47.4 0.0 0.3 0.0 6.3 0 21 0 0 0 0 c1t6d0
extended device statistics ---- errors ---
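One thing the iostat samples allow is working out the average size of each read (kr/s divided by r/s). The sketch below does that for the first and third samples shown above; the values are copied from the output, and the function and variable names are only illustrative.

```python
# (device, reads per second, kB read per second), copied from the iostat output
first_sample = [
    ("c1t1d0", 66.7, 146.6), ("c1t2d0", 65.9, 177.9), ("c1t3d0", 62.0, 145.0),
    ("c1t4d0", 68.3, 180.1), ("c1t5d0", 62.5, 143.1), ("c1t6d0", 65.9, 184.0),
]
third_sample = [
    ("c1t1d0", 30.6, 50.3), ("c1t2d0", 31.8, 48.8), ("c1t3d0", 29.9, 49.3),
    ("c1t4d0", 31.6, 47.6), ("c1t5d0", 26.4, 45.4), ("c1t6d0", 31.8, 49.3),
]

def avg_read_kb(sample):
    """Return {device: average kB per read} for one iostat sample."""
    return {dev: kb_per_s / reads_per_s for dev, reads_per_s, kb_per_s in sample}

for label, sample in (("first", first_sample), ("third", third_sample)):
    for dev, kb in avg_read_kb(sample).items():
        print(f"{label} sample {dev}: {kb:.1f} kB per read")
```

Every pool disk comes out at only a few kB per read, so even 30 to 70 reads per second per disk add up to very little throughput. Whether that is the actual cause of the slow scrub is not something these numbers alone can confirm.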

Pool layout is as follows:
zpool status storage
pool: storage
id: 9500427662938662247
state: ONLINE
scan: scrub in progress since Tue Jan 1 01:04:37 2002
5.52T scanned out of 6.85T at 1.23M/s, 13d02h to go
0 repaired, 80.67% done
config:

NAME    STATE   READ WRITE CKSUM  
storage   ONLINE    0   0   0  
 mirror-0 ONLINE    0   0   0  
  c1t1d0 ONLINE    0   0   0  
  c1t3d0 ONLINE    0   0   0  
  c1t5d0 ONLINE    0   0   0  
 mirror-1 ONLINE    0   0   0  
  c1t2d0 ONLINE    0   0   0  
  c1t4d0 ONLINE    0   0   0  
  c1t6d0 ONLINE    0   0   0  

errors: No known data errors

ARC statistics are as follows (from arcstat.sh):
5823 478 92.41%
4999 622 88.93%
4258 473 90.00%
5999 488 92.48%
4229 520 89.05%
4378 475 90.21%
4159 519 88.91%
4376 625 87.50%
4560 516 89.83%
5600 535 91.28%
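The three columns above are unlabeled. They are consistent with hits, misses and hit ratio, which is an assumption checked by the sketch below (hit ratio = hits / (hits + misses)); the variable names are illustrative only.

```python
# Rows copied from the arcstat.sh output above.
# Assumed column meaning: hits, misses, reported hit ratio in percent.
rows = [
    (5823, 478, 92.41), (4999, 622, 88.93), (4258, 473, 90.00),
    (5999, 488, 92.48), (4229, 520, 89.05), (4378, 475, 90.21),
    (4159, 519, 88.91), (4376, 625, 87.50), (4560, 516, 89.83),
    (5600, 535, 91.28),
]

def hit_ratio_pct(hits, misses):
    """ARC hit ratio as a percentage of all lookups."""
    return 100.0 * hits / (hits + misses)

for hits, misses, reported in rows:
    computed = hit_ratio_pct(hits, misses)
    print(f"{hits:5d} {misses:4d} reported={reported:.2f}% computed={computed:.2f}%")
```

If that reading of the columns is right, the ARC is serving roughly 88 to 92 percent of lookups from memory during the scrub.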
There are no faults in the system, there is nothing in the logs, and vmstat reports the following:
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
1 0 0 117262772 115450936 2135 502 0 0 0 0 0 30 76 77 76 2180 12976 2374 0 1 98
2 0 0 104194588 102072468 1 9 0 0 0 0 0 19 42 34 37 1543 681 1240 0 0 100
2 0 0 104185988 102067724 31 199 0 0 0 0 0 22 61 56 63 1732 911 1546 0 1 99
1 0 0 104184272 102063496 0 0 0 0 0 0 0 15 114 111 110 2107 672 2074 0 1 99
1 0 0 104177408 102058288 32 198 0 0 0 0 0 30 178 165 176 2692 907 2884 0 1 99
1 0 0 104170500 102053348 0 0 0 0 0 0 0 16 67 62 66 1786 727 1603 0 1 99

System configuration is:
prtdiag
System Configuration: SUN MICROSYSTEMS SUN FIRE X4270 M2 SERVER
BIOS Configuration: American Megatrends Inc. 08060108 12/27/2010
BMC Configuration: IPMI 1.5 (KCS: Keyboard Controller Style)

==== Processor Sockets ====================================

Version Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU X5690 @ 3.47GHz CPU 0
Intel(R) Xeon(R) CPU X5690 @ 3.47GHz CPU 1

System Configuration: Oracle Corporation i86pc
Memory size: 147448 Megabytes
System Peripherals (Software Nodes):

The first 70% of the scrub was done in a day; now the system is just chugging along at a leisurely pace, completely idle for the most part. My expectation is that the scrub should continue as fast as possible, given that there is no competition for resources.
Any insights into how I can better understand this poor performance?
