Why is kernel-2.6.9 (OEL-4) faster than kernel-2.6.18 (OEL-5) ?

511303, Sep 16 2007 (edited Sep 20 2007)

Hi,

Ever since RHEL-5 and then OEL-5 were released, I have been wondering why my own programs, compiled and run on RHEL-5/OEL-5, are slower than the same programs compiled and run on RHEL-4/OEL-4 on the same machine. This is really barmy, since gcc-4.1, shipped with RHEL-5/OEL-5, is a very aggressive compiler and produces faster binary code than gcc-3.4.6, shipped with RHEL-4/OEL-4. I have verified this a hundred times by testing both compilers on RHEL-4/OEL-4 and RHEL-5/OEL-5: the 4.1 compiler always produces a faster executable on the same OS.

The problem is obviously in kernel-2.6.18. There is something in the kernel (maybe the scheduler?) that slows down the execution of programs. But what? I experimented with various kernel boot parameters (e.g. "acpi=off"), and even recompiled the kernel many times with various combinations of config parameters, but nothing helped. Thus, I'm still wondering whether the problem can be solved by disabling one or more config parameters and recompiling the kernel, or whether it is deeply embedded in the main kernel code.

Has anybody on this forum experienced the same, say after running OEL-4 and then migrating to OEL-5?

Here are two examples showing different execution times on OEL-4.5 (kernel-2.6.9-55.0.5.0.1.EL.i686, gcc-3.4.6-8.0.1) and OEL-5 (kernel-2.6.18-8.1.10.0.1.el5, gcc-4.1.1-52.el5.2). The first example is trivial but very sensitive to overall system load and kernel version. The second example is the "Sieve of Eratosthenes", a CPU-bound program for finding prime numbers.

EXAMPLE 1.

/*-----------------------------------------*/
/*  Simple program for text screen console */
/*  very sensitive to overall system load  */
/*  and kernel version                     */
/*-----------------------------------------*/

#include <stdio.h>

int main(void)
{
    register int i;

    for(i = 0; i < 1000000; i++)
	printf(" %d ", i);
	
    return 0;
}
/* end of program */

$ gcc -O2 -o example1 -s example1.c
$ time ./example1

The average execution times on OEL-4.5 and OEL-5 are as follows:

----------------------------------
Mode      OEL-4.5         OEL-5
----------------------------------
real      0m3.141s        0m4.931s
user      0m0.394s        0m0.366s
sys       0m2.747s        0m4.563s 
----------------------------------

As we can see, the program compiled and run on OEL-4.5 (gcc-3.4.6 and kernel-2.6.9) is 57% faster than the same program compiled and run on OEL-5 (gcc-4.1.1 and kernel-2.6.18) on the same machine, although gcc-4.1.1 produces much faster binary code. Since the times the process spent in user mode are almost equal on both systems, the whole difference is due to the time the process spent in kernel mode. Note that kernel mode (sys) takes 66% more time on OEL-5. This tells me that "something" in kernel-2.6.18 slows down the execution of the program.

In the second example OEL-4.5 is also faster than OEL-5, but the differences in execution times are not as drastic as in the first example.

EXAMPLE 2.

/*-------------------------------------------*/
/*           Sieve of Eratosthenes           */
/*-------------------------------------------*/

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>

#define MAX_PRIME_AREA 100000
#define REPEAT_LOOP 10000

int main(void)
{
    int prime, composite, count;
    char *sieve_array;

    if ((sieve_array = (char *) malloc( (size_t) (MAX_PRIME_AREA + 1))) == NULL)
    {
	fprintf(stderr,"Memory block too big!\nMemory allocation failed!\a\n");
	exit(EXIT_FAILURE);
    }
    
    for(count = 0; count < REPEAT_LOOP; count++)
    { 
	for(prime = 0; prime < (MAX_PRIME_AREA + 1); prime++)
    	    *(sieve_array + prime) = (char) '\0';
    	
	for(prime = 3; prime < (MAX_PRIME_AREA + 1); prime += 2)
	{
	    if (! *(sieve_array + prime) )
	    {
		*(sieve_array + prime) = (char) 'P';  /* offset prime is a prime */
	        for(composite = (2 * prime); composite < (MAX_PRIME_AREA + 1); composite += prime)
		    *(sieve_array + composite) = (char) 'X';  /* offset composite is a composite */
	    }
	}
        /* DO NOT COMPILE FOR TEST !!!
        fprintf(stdout, "\n%d\n", 2);
        for(prime = 3; prime < (MAX_PRIME_AREA + 1); prime += 2)
            if ( *(sieve_array + prime) == 'P' )
                fprintf(stdout, "%d\n", prime);
        */
    }

    free(sieve_array);	
    return 0;
}
/* End of Sieve of Eratosthenes */

The average execution times on the same machine on OEL-4.5 and OEL-5 are:

---------------------------------------------------------
MAX_PRIME_AREA     Mode         OEL-4.5         OEL-5      
---------------------------------------------------------
                   real         0m9.196s        0m10.531s
   100000          user         0m9.189s        0m10.478s 
                   sys          0m0.002s        0m0.010s
--------------------------------------------------------- 
                   real         0m20.264s       0m21.532s
   200000          user         0m20.233s       0m21.490s
                   sys          0m0.020s        0m0.025s
---------------------------------------------------------
                   real         0m30.722s       0m33.502s
   300000          user         0m30.684s       0m33.456s  
                   sys          0m0.024s        0m0.032s
---------------------------------------------------------
                   real         1m10.163s       1m15.215s
   400000          user         1m10.087s       1m14.704s
                   sys          0m0.075s        0m0.079s 
---------------------------------------------------------

Does this ring a bell with anyone? Any clue why?

N.J.

Post Details
Locked on Oct 18 2007
Added on Sep 16 2007