- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Useful Compiler Options On x86: Difference between revisions

From HLRS Platforms

Revision as of 15:34, 16 February 2011

Introduction

The following options help to improve performance and to debug performance problems when compiling code for x86 processors with GCC or with the Intel or PGI C and Fortran compilers. Some tables also list the corresponding Cray compiler options.

Optimization Switches

| Description | GCC Option | ICC Option | PGCC Option |
|---|---|---|---|
| Enable aggressive optimization features in general | -O3 | -fast | -fast |
| Enable relaxed floating-point builtin functions | -ffast-math | -fp-model fast | -Mfprelaxed |
| Round denormalized FP values to zero | included in -ffast-math | -ftz | -Mdaz |
| Assume associativity of floating-point operations | -fno-signed-zeros -fno-trapping-math -fassociative-math | | -Mvect=assoc |
| Assume C ANSI aliasing rules are not violated | -fstrict-aliasing (included in -O2 and -O3) | -ansi-alias | |
| Enable interprocedural optimization over file boundaries | -fwhole-program | -ipo (included in -fast) | -Mipa=fast or -Mipa=fast,inline |

Notes

  • Using -ansi-alias often helps the Intel compiler auto-vectorize code, especially when OpenMP is enabled.

Input Language

| Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
|---|---|---|---|---|
| compile C code as C99 | -std=c99 or -std=gnu99 | -std=c99 | -c99 | on by default |
| enable OpenMP | -fopenmp | -openmp | -mp | on by default |
| enable accelerator directives | not supported | not supported | -ta=host, -ta=nvidia or -ta=cc13 | not supported |

Target Architecture

| Description | GCC Option | ICC Option | PGCC Option |
|---|---|---|---|
| select target processor | -march=<processor name> | -x <processor name> | -tp <processor name> |
| enable automatic SMP parallelization | not supported | not supported | -Mconcur |

Notes

  • The list of recognized processors differs between compilers, is very long, and keeps growing. See the compiler manuals for the current lists.

Diagnostic Output

| Description | GCC Option | ICC Option | PGCC Option |
|---|---|---|---|
| generate auto-vectorization information | -ftree-vectorizer-verbose=<level> | -vec-report=<level> | -Minfo=vec |

Notes

  • The PGI compiler has several other -Minfo=<type> switches that produce diagnostic output on inlining, unrolling, IPO, auto-parallelization, and accelerator usage.

Assembler Output

| Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
|---|---|---|---|---|
| generate assembler output instead of object file | -S | -S | -S | -S |
| generate assembler output with source code included in asm file | -g -Wa,-a,-ad | -S -fsource-asm | -S -Manno | -h list=d (not asm but intermediate representation) |