Updated from 11.1 to the latest 11.2 (pkg://solaris/entire@0.5.11,5.11-0.175.2.4.0.6.0:20141105T221724Z) and users are complaining about unpredictable I/O. My ARC is way too small; it looks like nearly direct I/O is hitting the disks.
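For what it's worth, the "nearly direct I/O" impression comes from watching the disks with plain iostat while clients read from the share, along these lines (interval is arbitrary):
root@nearline:~# iostat -xn 5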
arc_summary.pl:
System Memory:
        Physical RAM:   65485 MB
        Free Memory :   55915 MB
        LotsFree:       1023 MB

ZFS Tunables (/etc/system):

ARC Size:
        Current Size:             255 MB (arcsize)
        Target Size (Adaptive):   255 MB (c)
        Min Size (Hard Limit):    255 MB (zfs_arc_min)
        Max Size (Hard Limit):    64461 MB (zfs_arc_max)

ARC Size Breakdown:
        Most Recently Used Cache Size:    92%   236 MB (p)
        Most Frequently Used Cache Size:   7%   19 MB (c-p)

ARC Efficency:
        Cache Access Total:       347364813
        Cache Hit Ratio:    85%   295655603   [Defined State for buffer]
        Cache Miss Ratio:   14%   51709210    [Undefined State for Buffer]
        REAL Hit Ratio:     58%   202714691   [MRU/MFU Hits Only]

        Data Demand Efficiency:     96%
        Data Prefetch Efficiency:   17%

        CACHE HITS BY CACHE LIST:
          Anon:                        26%   79393321               [ New Customer, First Cache Hit ]
          Most Recently Used:          33%   100096705 (mru)        [ Return Customer ]
          Most Frequently Used:        34%   102617986 (mfu)        [ Frequent Customer ]
          Most Recently Used Ghost:     1%   4106987 (mru_ghost)    [ Return Customer Evicted, Now Back ]
          Most Frequently Used Ghost:   3%   9440604 (mfu_ghost)    [ Frequent Customer Evicted, Now Back ]
        CACHE HITS BY DATA TYPE:
          Demand Data:                 48%   143268545
          Prefetch Data:                2%   7108003
          Demand Metadata:             32%   96559788
          Prefetch Metadata:           16%   48719267
        CACHE MISSES BY DATA TYPE:
          Demand Data:                 10%   5400155
          Prefetch Data:               64%   33316565
          Demand Metadata:              5%   2913312
          Prefetch Metadata:           19%   10079178
---------------------------------------------
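The arc_summary numbers are only a point-in-time snapshot, so I've also been watching the ARC live with kstat to see whether size or the target ever move (interval and count are arbitrary):
root@nearline:~# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max 5 12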
root@nearline:~# echo '::arc' | mdb -k | egrep '^(size|arc_no_grow)'
size = 255 MB
arc_no_grow = 1
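To double-check that the free RAM really is free and not tied up by the kernel somewhere, I also dumped the kernel memory breakdown (long output, so not pasted here):
root@nearline:~# echo '::memstat' | mdb -k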
As you can see, the system has 64 GB of RAM and is a completely dedicated backup server (serving NFS/Samba). No other apps want the memory; it's just sitting there and the ARC isn't claiming it.
root@nearline:~# vmstat 1
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 47661296 37836248 6 128 0 2 138 0 181 101 0 0 103 41732 10880 70862 1 5 94
1 0 0 65028100 57338600 0 1900 0 0 399 0 484 106 0 0 104 42849 12655 46547 1 3 96
0 0 0 65026672 57338464 0 429 0 0 401 0 504 95 0 0 123 33351 16855 43719 2 2 96
0 0 0 65023828 57343316 0 529 0 0 396 0 562 93 0 0 104 18850 1420 40633 0 2 97
0 0 0 65013432 57318308 0 552 0 0 392 0 516 148 0 0 136 42416 2492 87907 0 4 96
1 0 0 65030928 57347144 0 361 0 0 400 0 545 78 0 0 83 9265 570 46956 0 1 99
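Since the "ZFS Tunables (/etc/system)" section of arc_summary is empty, I don't believe anything is deliberately capping the ARC, but for completeness this is how I double-checked for leftover zfs_arc_* settings:
root@nearline:~# grep -i zfs /etc/system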
Does anyone have any clues? I'd like to file a bug, but I don't know how to troubleshoot this further; ZFS has been rock solid for me since Solaris 10. What really throws a further wrench in is that it's only for very short periods of time. The only thing unusual is that I have added and removed log devices.
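For reference, this is roughly how I'm going back over the log device changes, to pin down when they were added and removed and what the pool looks like now (the egrep pattern is just a crude filter):
root@nearline:~# zpool history | egrep 'add|remove'
root@nearline:~# zpool status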