There is an environment based on ESX 5.1, on which Oracle Linux 6.6 64-bit is a supported guest OS. So, Oracle Linux 6.6 64-bit runs in a guest virtual machine that resides on ESX 5.1.
The guest VM has 2 CPU sockets with 4 cores in each socket, for a total of 8 CPU cores.
The configuration is as follows:
socket 1 = cpu0, cpu1, cpu2, cpu3; socket 2 = cpu4, cpu5, cpu6, cpu7
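To confirm the topology the guest actually presents to the kernel, something like the following can be used (standard procfs paths and the `lscpu` tool from util-linux; nothing VMware-specific assumed):

```shell
# Sockets/cores/threads as seen by the guest kernel
lscpu

# Per-CPU socket mapping: which physical package each logical CPU belongs to
grep -E '^(processor|physical id)' /proc/cpuinfo
```

On this VM the "physical id" values should show cpu0-cpu3 on package 0 and cpu4-cpu7 on package 1.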
So far so good.
The problem is CPU utilization.
When running kernel 3.8.13-98.1.1.el6uek.x86_64, Oracle Linux 6.6 uses only 4 CPUs, no matter what. I have analyzed CPU utilization carefully, and the scheduler never places any work on the last 4 CPU cores.
Oracle Linux 6.6 sees all 8 CPUs.
If I use the taskset utility to force a process onto a specific CPU core that Oracle Linux normally does not utilize, the process runs on that CPU without any problems, and we can see the utilization of that core reach 100%, as expected.
[root@somehost opt]# taskset -c -p 6 2313
pid 2313's current affinity list: 0-7
pid 2313's new affinity list: 6
[root@somehost ~]# top
top - 19:10:14 up 4 days, 6:02, 6 users, load average: 1.06, 0.63, 0.32
Tasks: 432 total, 2 running, 430 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.7%us, 0.7%sy, 0.0%ni, 98.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 99.7%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 32687204k total, 32384496k used, 302708k free, 85320k buffers
Swap: 33554428k total, 33320k used, 33521108k free, 21913992k cached
So, when forced, Oracle Linux 6.6 with the 3.8.13-98.1.1.el6uek.x86_64 kernel can use all the cores, but on its own the scheduler never utilizes the 4 cores on the second CPU socket, even under very heavy load, as seen below (cpu4, cpu5, cpu6 and cpu7 sit completely idle).
top - 12:51:32 up 3 days, 23:43, 3 users, load average: 16.74, 9.82, 5.30
Tasks: 454 total, 18 running, 436 sleeping, 0 stopped, 0 zombie
Cpu0 : 92.2%us, 6.6%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
Cpu1 : 94.2%us, 4.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.1%si, 0.0%st
Cpu2 : 93.4%us, 4.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.8%si, 0.0%st
Cpu3 : 92.7%us, 5.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32687204k total, 32521340k used, 165864k free, 104284k buffers
Swap: 33554428k total, 32992k used, 33521436k free, 21020212k cached
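Before digging into the scheduler itself, it is worth confirming that the idle CPUs are actually online and allowed from the kernel's point of view. A quick sanity check (standard sysfs/procfs paths, nothing Oracle-specific assumed):

```shell
# CPUs the kernel considers online; on this VM it should read 0-7
cat /sys/devices/system/cpu/online

# CPUs a freshly started task is allowed to run on; should also cover 0-7
grep Cpus_allowed_list /proc/self/status
```

If either of these excludes cpu4-cpu7, the problem is a CPU hotplug/affinity restriction rather than a load-balancing bug.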
The strange thing is that the issue cannot be reproduced with the 3.8.13-44 el6uek kernel.
When booted with the 3.8.13-44 el6uek kernel, Oracle Linux 6.6 sees and utilizes all the CPU cores without any problems, perfectly in balance.
So the problem, basically, is: "Oracle Linux 6.6 with the 3.8.13-98.1.1.el6uek.x86_64 kernel only uses the CPU cores on the first CPU socket."
What could be the reason here? Some scheduler configuration? Anything that might limit CPU core or socket usage?
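For reference, here are a few settings that can silently restrict which CPUs the scheduler uses, which I could rule out (the cgroup mount point is an assumption based on a stock Oracle Linux 6 layout):

```shell
# 1. Kernel command line: isolcpus= removes CPUs from scheduler load balancing
grep -o 'isolcpus=[^ ]*' /proc/cmdline || echo "isolcpus not set"

# 2. Affinity inherited from init: if PID 1 is restricted, every child is too
grep Cpus_allowed_list /proc/1/status

# 3. cpuset cgroup (mounted under /cgroup on Oracle Linux 6, if enabled)
cat /cgroup/cpuset/cpuset.cpus 2>/dev/null || echo "no cpuset cgroup mounted"
```

If all of these show 0-7 unrestricted, the remaining suspect would be the scheduler's load-balancing behavior in the -98 UEK kernel itself (e.g. how it builds scheduling domains from the virtual NUMA/socket topology ESX presents).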