- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Useful Compiler Options On x86
Latest revision as of 10:44, 5 August 2011
Introduction
The following options help to improve performance and to debug performance problems when compiling code for x86 processors with the GCC, Intel, PGI or Cray C or Fortran compilers.
Optimization Switches
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
Enable aggressive optimization features in general | -O3 | -O3 | -O3 | -O3 |
Enable relaxed floating point builtin functions | -ffast-math | -fp-model fast | -Mfprelaxed | |
Round denormalized FP values to zero | included in -ffast-math | -ftz | -Mdaz | |
Assume associativity of floating point operations | -fno-signed-zeros -fno-trapping-math -fassociative-math | | -Mvect=assoc | |
Assume C ANSI aliasing rules are not violated | -fstrict-aliasing (included in -O2 and -O3) | -ansi-alias | | |
Enable optimization over file boundaries | -fwhole-program -combine | | | |
Enable optimization over file boundaries at link time | -flto | -ipo (included in -fast) | -Mipa=fast or -Mipa=fast,inline | -hipa5 -hwp -hpl=<tempdir> |
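
The floating point rows trade strict IEEE 754 behaviour for speed. The following C sketch is an illustration only (it is not part of the original page, and the function names are hypothetical). It shows two effects those options can change: regrouping a sum changes its result, and values below DBL_MIN are subnormal unless flush-to-zero is active. The volatile qualifiers only keep the compiler from folding the arithmetic at compile time.

```c
#include <float.h>

/* Hypothetical helper: sums left to right, (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* force the intermediate rounding */
    return t + c;
}

/* Hypothetical helper: same values, grouped as a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* force the intermediate rounding */
    return a + t;
}

/* Returns 1 if half of the smallest normal double is still a nonzero
   (subnormal) value. With flush-to-zero active, the division result
   would be zero and this would return 0. */
int half_of_dbl_min_is_subnormal(void)
{
    volatile double tiny = DBL_MIN;
    volatile double half = tiny / 2.0;
    return half > 0.0 && half < DBL_MIN;
}
```

Compiled without the relaxed options, sum_left(1e20, -1e20, 1.0) yields 1.0 while sum_right(1e20, -1e20, 1.0) yields 0.0. Options that assume associativity allow the compiler to choose either grouping, so results may differ between builds.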
Notes
- Using -ansi-alias often helps the Intel compiler to auto-vectorize code, especially when OpenMP is enabled.
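
The aliasing options let the compiler assume that pointers to different types never refer to the same object. Code that reinterprets memory through a cast pointer breaks that assumption and may be miscompiled once the option is on. A minimal sketch of a conforming alternative follows; the function name is hypothetical and the code assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Pattern to avoid (violates the aliasing rules, because a float object
   is read through a uint32_t lvalue):
       uint32_t bits = *(uint32_t *)&f;
*/

/* Conforming version: copy the object representation with memcpy.
   Optimizing compilers usually reduce this to a register move. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* assumes sizeof(float) == 4 */
    return bits;
}
```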
Input Language
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
compile C code as C99 | -std=c99 or -std=gnu99 | -std=c99 | -c99 | on by default |
enable OpenMP | -fopenmp | -openmp | -mp | on by default |
enable accelerator directives | not supported | not supported | -ta=host -ta=nvidia or -ta=cc13 | not supported |
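
As an illustration of the first two rows, the sketch below uses C99 features (restrict, a loop-scoped counter) and an OpenMP directive. The function names are hypothetical. The _OPENMP macro is defined by OpenMP-conforming compilers when OpenMP is enabled, so the same source also builds serially.

```c
#include <stddef.h>

/* Sums n integers. Uses C99 syntax, so compilers that default to an older
   C dialect need the C99 option from the table above. */
long sum_ints(const int *restrict data, size_t n)
{
    long total = 0;

    /* The directive is only compiled in when OpenMP is enabled;
       otherwise the loop runs serially. */
#ifdef _OPENMP
    #pragma omp parallel for reduction(+:total)
#endif
    for (long i = 0; i < (long)n; ++i) {
        total += data[i];
    }
    return total;
}

/* Returns 1 if this file was compiled with OpenMP enabled, else 0. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```

Calling openmp_enabled() is a quick way to check whether the OpenMP switch actually reached the compiler, for example when a build system silently drops flags.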
Target Architecture
Description | GCC Option | ICC Option | PGCC Option |
---|---|---|---|
select target processor | -march=<processor name> | -x <processor name> | -tp <processor name> |
enable automatic SMP parallelization | not supported | not supported | -Mconcur |
Notes
- The list of recognized processors differs between compilers, is very long, and keeps growing. See each compiler's manual for its list.
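
One way to check what a target processor option enabled is to test the predefined feature macros. The sketch below assumes the __AVX__, __SSE4_2__ and __SSE2__ macros as predefined by GCC and the Intel compiler; other compilers may use different names, and the function name is hypothetical.

```c
/* Reports the widest x86 SIMD extension (of the few tested here) that was
   enabled at compile time by the target processor option. */
const char *simd_level(void)
{
#if defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline";
#endif
}
```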
Diagnostic Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate auto-vectorization information | -ftree-vectorizer-verbose=<level> | -vec-report=<level> | -Minfo=vec | -h list=m |
Notes
- the PGI compiler has several other -Minfo=<type> switches to produce diagnostic output on inlining, unrolling, IPO, auto-parallelization and accelerator usage
- the Cray compiler has several other -h list=<type> switches to produce diagnostic output on optimization
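
A small loop is useful for trying out the vectorization reports. The following sketch is a hypothetical test case, not taken from the original page:

```c
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i]. The restrict qualifiers promise that x
   and y do not overlap, which removes a common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiling this file with -O3 plus the diagnostic option from the table should report whether the loop was vectorized. The wording of the report depends on the compiler and its version. Newer GCC releases replaced -ftree-vectorizer-verbose with the -fopt-info family of options.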
Assembler Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate assembler output instead of object file | -S | -S | -S | -S |
generate assembler output with source code included in asm file | -g -Wa,-a,-ad | -S -fsource-asm | -S -Manno | -h list=d (not asm but intermediate representation) |
Notes
- Assembler code of object files or executables can be shown with:
- objdump -d <objfile>
- Source-annotated assembler code of object files or executables can be shown with:
- objdump -dS <objfile>
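
A short function keeps the generated listing readable when trying these options. The example below is hypothetical:

```c
/* Clamps v to the range [lo, hi]; small enough that the whole function
   fits on one screen of assembler output. */
int clamp_int(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}
```

With -S, the compiler writes an assembler listing instead of an object file. Note that objdump -dS can only interleave source lines if the object file was built with debug information (-g).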