- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Useful Compiler Options On x86
Revision as of 10:33, 5 August 2011
Introduction
The following options help to improve performance and to diagnose performance problems when compiling code for x86 processors with the GCC, Intel, PGI, or Cray C and Fortran compilers.
Optimization Switches
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
Enable aggressive optimization features in general | -O3 | -O3 | -O3 | -O3 |
Enable relaxed floating-point builtin functions | -ffast-math | -fp-model fast | -Mfprelaxed | |
Round denormalized FP values to zero | included in -ffast-math | -ftz | -Mdaz | |
Assume associativity of floating-point operations | -fno-signed-zeros -fno-trapping-math -fassociative-math | | -Mvect=assoc | |
Assume ANSI C aliasing rules are not violated | -fstrict-aliasing (included in -O2 and -O3) | -ansi-alias | | |
Enable optimization over file boundaries | -fwhole-program -combine | | | |
Enable optimization over file boundaries at link time | -flto | -ipo (included in -fast) | -Mipa=fast or -Mipa=fast,inline | -hipa5 -hwp -hpl=<tempdir> |
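The associativity switches in the table can change numerical results, because floating-point addition is not associative. The following C sketch illustrates this; the function names (sum_left, sum_right, show_reassociation) and the operands are chosen purely for illustration, and volatile is used only to pin the evaluation order so that the effect is visible regardless of the optimization level.

```c
#include <stdio.h>

/* Evaluates (a + b) + c. The volatile temporary forces the
   intermediate sum to be rounded and stored before c is added. */
double sum_left(double a, double b, double c) {
    volatile double t = a + b;
    return t + c;
}

/* Evaluates a + (b + c), the order a compiler may pick once it is
   allowed to treat floating-point addition as associative. */
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    return a + t;
}

/* Prints both orderings for operands of very different magnitude.
   The small term 1.0 is absorbed when it is added to -1e20 first. */
void show_reassociation(void) {
    printf("(a+b)+c = %g\n", sum_left(1e20, -1e20, 1.0));
    printf("a+(b+c) = %g\n", sum_right(1e20, -1e20, 1.0));
}
```

If a code relies on a particular summation order (for example compensated summation), it is probably safer to leave these switches off for the affected files.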
Notes
- Using -ansi-alias often helps the Intel compiler to auto-vectorize code, especially when OpenMP is enabled.
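What the aliasing assumption permits can be sketched in C as follows. The functions scale_by_count and float_bits are hypothetical examples, not taken from any compiler manual. The first shows a loop in which the compiler may assume that stores through a float pointer never modify an int object; the second shows a way to reinterpret bits that stays within the aliasing rules, assuming a 32-bit IEEE 754 float as found on x86.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 'data' points to float and 'count' points to int. Under the ANSI
   aliasing rules the compiler may assume that writing data[i] never
   changes *count, so *count can be loaded once outside the loop and
   the loop becomes a simple candidate for vectorization. */
void scale_by_count(float *data, const int *count, size_t n) {
    for (size_t i = 0; i < n; ++i)
        data[i] *= (float)*count;
}

/* Reads the bit pattern of a float without violating the aliasing
   rules. Casting &f to uint32_t* and dereferencing it would be
   undefined behaviour once strict aliasing is assumed; memcpy is not.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Code that accesses the same memory through pointers of unrelated types should be checked before these options are enabled.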
Input Language
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
compile C code as C99 | -std=c99 or -std=gnu99 | -std=c99 | -c99 | on by default |
enable OpenMP | -fopenmp | -openmp | -mp | on by default |
enable accelerator directives | not supported | not supported | -ta=host -ta=nvidia or -ta=cc13 | not supported |
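As a small illustration of the OpenMP switch, the following C99 sketch uses a parallel reduction. The function names dot and openmp_enabled are made up for this example. Without the OpenMP option the pragma is ignored and the loop runs serially; the _OPENMP macro is defined by conforming compilers only when OpenMP is enabled, so it can be used to check which case applies.

```c
/* Dot product of two vectors of length n. With OpenMP enabled the
   iterations are distributed over threads and the partial sums are
   combined by the reduction clause. */
double dot(const double *x, const double *y, int n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Returns 1 if this file was compiled with OpenMP enabled, else 0. */
int openmp_enabled(void) {
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```

Note that a parallel reduction may add the partial sums in a different order than the serial loop, so results can differ in the last bits for general floating-point input.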
Target Architecture
Description | GCC Option | ICC Option | PGCC Option |
---|---|---|---|
select target processor | -march=<processor name> | -x <processor name> | -tp <processor name> |
enable automatic SMP parallelization | not supported | not supported | -Mconcur |
Notes
- The list of recognized processor names differs between compilers, is very long, and keeps growing. See the manual of each compiler for its list.
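One way to check what a target processor switch actually enabled is to test the predefined instruction-set macros. The sketch below assumes the macro names defined by GCC and the Intel compiler (__SSE2__, __AVX__ and so on); other compilers may use different names, and the function name simd_level is hypothetical.

```c
/* Returns the name of the widest x86 SIMD extension that the compiler
   was allowed to use for this translation unit, judged by the
   predefined macros. Falls back to "baseline" if none is defined. */
const char *simd_level(void) {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline";
#endif
}
```

Compiling the same file with different target processor settings and printing the returned string shows whether the switch had the intended effect.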
Diagnostic Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate auto-vectorization information | -ftree-vectorizer-verbose=<level> | -vec-report=<level> | -Minfo=vec | -h list=m |
Notes
- The PGI compiler has several other -Minfo=<type> switches that produce diagnostic output on inlining, unrolling, IPO, auto-parallelization and accelerator usage.
- The Cray compiler has several other -h list=<type> switches that produce diagnostic output on optimization.
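To try out the vectorization reports, it helps to have one loop that can be vectorized and one that typically cannot. The following C99 sketch provides both; the function names saxpy and prefix_sum are illustrative, and whether a given compiler version vectorizes each loop is an expectation, not a guarantee.

```c
/* Independent iterations: a typical auto-vectorization candidate.
   The restrict qualifiers (C99) promise that x and y do not overlap. */
void saxpy(int n, float a, const float *restrict x, float *restrict y) {
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Loop-carried dependency: iteration i needs the result of iteration
   i - 1, so compilers usually report that this loop was not
   vectorized as written. */
void prefix_sum(int n, float *v) {
    for (int i = 1; i < n; ++i)
        v[i] += v[i - 1];
}
```

Compiling this file with -O3 and one of the report switches from the table should list the first loop as vectorized and give a reason for skipping the second.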
Assembler Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate assembler output instead of object file | -S | -S | -S | -S |
generate assembler output with source code included in asm file | -g -Wa,-a,-ad | -S -fsource-asm | -S -Manno | -h list=d (not asm but intermediate representation) |