Effect of compiler options and coding techniques

This example looks at the performance of a simple vector add operation,
    for (i=0; i<n; i++) {
	a[i] += b[i];
    }
with various levels of compiler optimization and with different arrangements of the code, including the use of pragmas and pseudo-functions.

Code and results
Code    Code with unroll pragma    Driver        Results
s1.c    s2.c                       s1driver.c    file
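
For readers without access to the files above, the following is a minimal sketch of how a timing driver for this kernel might be structured. It is not the contents of s1driver.c; the names vec_add and time_vec_add, the use of clock(), and the initial values are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Kernel under test: the same loop as in the text. */
void vec_add(double *a, double *b, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        a[i] += b[i];
    }
}

/* Hypothetical helper: returns the average CPU time in seconds for one
 * call of vec_add on vectors of length n, measured over reps calls.
 * Returns -1.0 on invalid arguments or allocation failure. */
double time_vec_add(int n, int reps)
{
    double *a, *b, elapsed;
    clock_t t0, t1;
    int i, r;

    if (n <= 0 || reps <= 0) return -1.0;
    a = malloc((size_t)n * sizeof *a);
    b = malloc((size_t)n * sizeof *b);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1.0;
    }
    for (i = 0; i < n; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
    }

    t0 = clock();
    for (r = 0; r < reps; r++) {
        vec_add(a, b, n);
    }
    t1 = clock();

    /* Each call adds 1.0 to every element, so every a[i] should now
     * equal reps. Reading the result also gives the compiler a reason
     * to keep the timed loop. */
    assert(a[0] == (double)reps && a[n - 1] == (double)reps);

    elapsed = (double)(t1 - t0) / CLOCKS_PER_SEC;
    free(a);
    free(b);
    return elapsed / reps;
}
```

At aggressive optimization levels the compiler may inline or restructure the timed loop, so a real driver would probably keep the kernel in a separate source file, as the split between s1.c and s1driver.c suggests.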

The general recommendations based on these results are:

  • C code should use pragma disjoint wherever the pointers involved are known not to alias
  • pragma unroll can boost performance
  • The __alignx pseudo-function helps sometimes but hurts other times. In particular, in this example, it decreased performance when pragma unroll was used.
  • const did not help (or hurt) when pragma disjoint was used.
  • Using pragma unroll with -O3 gave performance that was almost as good as -O5, suggesting that codes that cannot be compiled correctly at -O5 may benefit from pragma unroll.
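
The const and __alignx variants referred to in the list might look roughly like the sketch below. It is not taken from s1.c or s2.c: the function name add_const_aligned, the wrapper macro ASSUME_ALIGNED, the USING_XL guard, and the 16-byte boundary are all assumptions for illustration. The IBM-specific pieces are guarded so that other compilers fall back to a plain loop.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __IBMC__ / __xlC__ identify the IBM XL compilers. */
#if defined(__IBMC__) || defined(__xlC__)
#define USING_XL 1
/* __alignx promises the compiler that ptr is aligned on the boundary. */
#define ASSUME_ALIGNED(boundary, ptr) __alignx((boundary), (ptr))
#else
/* Elsewhere the hypothetical wrapper expands to nothing. */
#define ASSUME_ALIGNED(boundary, ptr) ((void)0)
#endif

/* Variant with a const source vector and alignment promises.
 * The caller must really pass 16-byte aligned, non-overlapping
 * arrays when compiling with XL; a false promise is an error. */
void add_const_aligned(double *a, const double *b, int n)
{
#ifdef USING_XL
#pragma disjoint (*a, *b)
#endif
    int i;

    assert(a != NULL && b != NULL);
    ASSUME_ALIGNED(16, a);
    ASSUME_ALIGNED(16, b);
    for (i = 0; i < n; i++) {
        a[i] += b[i];
    }
}
```

Since the results above show that __alignx can reduce performance in combination with pragma unroll, a variant like this is worth timing against the plain version rather than adopting by default.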

Interestingly enough, in this example no explicit code transformations were necessary; the use of pragmas was sufficient. The winning version (with -O5) was

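void add2( double *a, double *b, int n )
{
    /* Promise the compiler that a and b never refer to the same storage */
#pragma disjoint (*a,*b)
    int i;
    /* Ask the compiler to unroll the loop below by a factor of 10 */
#pragma unroll(10)
    for (i=0; i<n; i++) {
        a[i] += b[i];
    }
}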
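
For comparison, this is the kind of explicit code transformation that the unroll pragma makes unnecessary: a hand-unrolled loop with a cleanup loop for the leftover elements. It is only a sketch. The name add2_unrolled4 is hypothetical, and a factor of 4 is used instead of 10 to keep the listing short.

```c
#include <assert.h>

/* Hand-unrolled vector add (factor 4), written out manually instead
 * of relying on an unroll pragma. */
void add2_unrolled4(double *a, double *b, int n)
{
    int i;
    int n4;

    assert(n >= 0);
    n4 = n - (n % 4);   /* largest multiple of 4 not exceeding n */

    /* Main loop: four independent updates per iteration. */
    for (i = 0; i < n4; i += 4) {
        a[i]     += b[i];
        a[i + 1] += b[i + 1];
        a[i + 2] += b[i + 2];
        a[i + 3] += b[i + 3];
    }
    /* Cleanup loop: the remaining n % 4 elements. */
    for (; i < n; i++) {
        a[i] += b[i];
    }
}
```

The pragma version keeps the source loop in its original three-line form and leaves the choice of unroll factor as a one-line change.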
MCS Division Argonne National Laboratory University of Chicago