Hi,
I have a strange load average I can't explain on a Linux server.
I run RHEL 5.8 on two 10-core Intel CPUs (Hyper-Threading enabled), so I see 40 "CPUs" when looking at /proc/cpuinfo.
The load average is always above 20, so as far as I understand load average, there are more than 20 processes either running on a CPU or waiting in the run queue.
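One way to check that understanding against the live system is to tally task states straight from /proc. The following is only a minimal sketch, assuming a Python 3 interpreter is available on the box (the stock RHEL 5 Python is older and may not run it as written); the function names `state_from_stat` and `count_task_states` are made up for illustration. It walks `/proc/<pid>/task/<tid>/stat` so that threads are counted individually. States R (running or runnable) and D (uninterruptible sleep, usually waiting on I/O) are the ones worth comparing against the load average.

```python
from collections import Counter
import glob


def state_from_stat(stat_text):
    """Return the one-letter state from the contents of a /proc/.../stat file.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split after the LAST ')' rather than on whitespace.
    """
    after_comm = stat_text[stat_text.rfind(")") + 1:]
    return after_comm.split()[0]


def count_task_states(proc_root="/proc"):
    """Tally the states of all tasks (threads) visible under proc_root."""
    counts = Counter()
    for path in glob.glob(proc_root + "/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as handle:
                counts[state_from_stat(handle.read())] += 1
        except (OSError, IndexError):
            # The task exited between listing and reading; skip it.
            continue
    return counts


if __name__ == "__main__":
    states = count_task_states()
    for state, number in sorted(states.items()):
        print(state, number)
```

Running it a few times in a row during the business day should show whether the R and D counts together stay near the reported load average.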
The load is very stable during the business day: the load average stays around 20, with I/O wait between 20 and 25% and %user CPU close to 1%.
If I look at CPU activity using top, about 23.7% of CPU is used (mostly I/O wait).
Using sar, I see that the run queue contains only 1 process.
So how can the load average be 20 and not something close to 10?
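The "close to 10" estimate presumably comes from applying the non-idle share reported by top to the 40 logical CPUs. As a rough sketch of that arithmetic (the helper names `parse_cpu_line` and `cpu_equivalents` are hypothetical, and the input is the `Cpu(s)` summary line copied from the top output in this post):

```python
import re

# Summary line copied from the top output in this post.
TOP_CPU_LINE = ("Cpu(s): 0.9%us, 0.3%sy, 0.0%ni, 76.2%id, "
                "22.5%wa, 0.0%hi, 0.2%si, 0.0%st")
LOGICAL_CPUS = 40  # 2 sockets x 10 cores x 2 hardware threads


def parse_cpu_line(line):
    """Map each field name (us, sy, id, wa, ...) to its percentage."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


def cpu_equivalents(percent, logical_cpus):
    """Convert a machine-wide percentage into a number of fully busy CPUs."""
    return percent / 100.0 * logical_cpus


fields = parse_cpu_line(TOP_CPU_LINE)
non_idle = 100.0 - fields["id"]

print("non-idle share: %.1f%%" % non_idle)
print("CPU equivalents, non-idle: %.2f"
      % cpu_equivalents(non_idle, LOGICAL_CPUS))
print("CPU equivalents, iowait only: %.2f"
      % cpu_equivalents(fields["wa"], LOGICAL_CPUS))
```

With these numbers the non-idle share works out to roughly 9.5 CPU equivalents, which is where an expected load average near 10 would come from.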
Below is sample output from top and sar:
top - 10:25:12 up 411 days, 23:05, 3 users, load average: 22.72, 22.64, 22.66
Tasks: 858 total, 1 running, 857 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.9%us, 0.3%sy, 0.0%ni, 76.2%id, 22.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 132028164k total, 106315820k used, 25712344k free, 20927060k buffers
Swap: 33554424k total, 1348108k used, 32206316k free, 72105916k cached
top - 10:25:31 up 411 days, 23:05, 3 users, load average: 22.51, 22.60, 22.64
Tasks: 858 total, 1 running, 857 sleeping, 0 stopped, 0 zombie
Cpu0 : 32.5%us, 10.9%sy, 0.0%ni, 54.7%id, 0.7%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu1 : 8.2%us, 2.5%sy, 0.0%ni, 87.6%id, 1.4%wa, 0.0%hi, 0.2%si, 0.0%st
Cpu2 : 3.9%us, 1.1%sy, 0.0%ni, 93.5%id, 1.5%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu3 : 1.9%us, 0.3%sy, 0.0%ni, 96.2%id, 1.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 1.7%us, 0.3%sy, 0.0%ni, 96.5%id, 1.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 2.3%us, 0.5%sy, 0.0%ni, 93.9%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 1.6%us, 0.3%sy, 0.0%ni, 96.5%id, 1.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 1.6%us, 0.3%sy, 0.0%ni, 96.5%id, 1.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 1.6%us, 0.3%sy, 0.0%ni, 96.5%id, 1.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 1.5%us, 0.3%sy, 0.0%ni, 96.6%id, 1.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 2.2%us, 0.4%sy, 0.0%ni, 39.8%id, 57.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 2.0%us, 0.3%sy, 0.0%ni, 95.4%id, 2.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 2.0%us, 0.3%sy, 0.0%ni, 95.5%id, 2.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 6.7%us, 0.7%sy, 0.0%ni, 87.1%id, 5.4%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu14 : 7.3%us, 0.9%sy, 0.0%ni, 84.8%id, 6.9%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu15 : 3.4%us, 0.5%sy, 0.0%ni, 93.3%id, 2.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 3.5%us, 0.5%sy, 0.0%ni, 93.3%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 3.6%us, 0.5%sy, 0.0%ni, 93.2%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 3.4%us, 0.5%sy, 0.0%ni, 93.6%id, 2.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 2.8%us, 0.4%sy, 0.0%ni, 94.3%id, 2.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 2.5%us, 0.3%sy, 0.0%ni, 95.2%id, 1.9%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 2.4%us, 0.4%sy, 0.0%ni, 40.4%id, 56.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 2.2%us, 0.4%sy, 0.0%ni, 40.6%id, 56.9%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 2.1%us, 0.3%sy, 0.0%ni, 95.9%id, 1.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu24 : 2.0%us, 0.3%sy, 0.0%ni, 95.9%id, 1.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu25 : 6.3%us, 2.0%sy, 0.0%ni, 81.9%id, 8.6%wa, 0.2%hi, 1.1%si, 0.0%st
Cpu26 : 2.3%us, 0.3%sy, 0.0%ni, 95.5%id, 1.9%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu27 : 2.4%us, 0.3%sy, 0.0%ni, 95.2%id, 2.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu28 : 2.3%us, 0.3%sy, 0.0%ni, 40.8%id, 56.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu29 : 2.4%us, 0.3%sy, 0.0%ni, 40.7%id, 56.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu30 : 2.8%us, 0.4%sy, 0.0%ni, 94.0%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu31 : 2.8%us, 0.5%sy, 0.0%ni, 40.0%id, 56.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu32 : 2.8%us, 0.4%sy, 0.0%ni, 94.5%id, 2.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu33 : 5.7%us, 1.2%sy, 0.0%ni, 36.5%id, 56.0%wa, 0.1%hi, 0.5%si, 0.0%st
Cpu34 : 9.5%us, 2.7%sy, 0.0%ni, 29.7%id, 56.3%wa, 0.2%hi, 1.5%si, 0.0%st
Cpu35 : 6.2%us, 0.8%sy, 0.0%ni, 38.0%id, 54.3%wa, 0.1%hi, 0.6%si, 0.0%st
Cpu36 : 6.2%us, 0.7%sy, 0.0%ni, 88.9%id, 2.8%wa, 0.2%hi, 1.2%si, 0.0%st
Cpu37 : 6.5%us, 0.7%sy, 0.0%ni, 88.1%id, 2.8%wa, 0.2%hi, 1.7%si, 0.0%st
Cpu38 : 6.1%us, 0.7%sy, 0.0%ni, 88.6%id, 2.7%wa, 0.2%hi, 1.8%si, 0.0%st
Cpu39 : 4.5%us, 0.5%sy, 0.0%ni, 90.4%id, 2.6%wa, 0.1%hi, 1.9%si, 0.0%st
Mem: 132028164k total, 106316600k used, 25711564k free, 20927060k buffers
Swap: 33554424k total, 1348108k used, 32206316k free, 72105920k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11036 rabbitmq 25 0 887m 114m 2484 S 46.3 0.1 334800:00 beam.smp
10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
10:10:01 AM         0      1194     22.64     22.71     22.75
10:20:01 AM         0      1215     22.42     22.54     22.65
10:30:01 AM         1      1203     22.91     22.73     22.67
Average:            2      1220     24.49     24.44     24.73
Regards,
Nicolas