- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Useful Compiler Options On x86
Introduction
The following options help improve performance and diagnose performance problems when compiling code for x86 processors with the GCC, Intel, PGI or Cray C and Fortran compilers.
Optimization Switches
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
Enable aggressive optimization features in general | -O3 | -O3 | -O3 | -O3 |
Enable relaxed floating-point built-in functions | -ffast-math | -fp-model fast | -Mfprelaxed | |
Flush denormalized FP values to zero | included in -ffast-math | -ftz | -Mdaz | |
Assume associativity of floating-point operations | -fno-signed-zeros -fno-trapping-math -fassociative-math | | -Mvect=assoc | |
Assume ANSI C aliasing rules are not violated | -fstrict-aliasing (included in -O2 and -O3) | -ansi-alias | | |
Enable optimization over file boundaries | -fwhole-program -combine | | | |
Enable optimization over file boundaries at link time | -flto | -ipo (included in -fast) | -Mipa=fast or -Mipa=fast,inline | -hipa5 -hwp -hpl=<tempdir> |
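The associativity options in the table let the compiler reorder floating-point sums, which can change results. The following is a minimal sketch of why that matters; the function names are hypothetical and chosen only for illustration.

```c
/* Two mathematically equivalent sums that differ in IEEE 754 arithmetic.
 * Function names are hypothetical, for illustration only. */

/* Evaluates (a + b) + c exactly in the order written. */
double sum_left_to_right(double a, double b, double c)
{
    return (a + b) + c;
}

/* Evaluates a + (b + c): the regrouping that an "assume associativity"
 * option would allow the compiler to pick on its own. */
double sum_right_to_left(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0 under strict IEEE evaluation, the first function returns 1.0 while the second returns 0.0, because 1.0 is lost when it is added to -1e20. Once associativity is assumed, the compiler may produce either result for either function.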
Notes
- Using -ansi-alias often helps the Intel compiler auto-vectorize code, especially when OpenMP is enabled.
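As a sketch of what the aliasing assumption buys, consider a loop whose trip count is read through a pointer of a different type than the data it writes. The function name scale_array is hypothetical, and whether a given compiler actually vectorizes this loop depends on its version and the other options used.

```c
/* Hypothetical example: the loop bound is read through an int pointer,
 * the data is written through a float pointer. */
void scale_array(float *data, const int *count, float factor)
{
    /* If ANSI aliasing rules are assumed, a store through a float pointer
     * cannot modify the int that count points to, so the compiler may load
     * *count once before the loop instead of re-reading it after every
     * store. A fixed trip count is usually a precondition for vectorizing. */
    for (int i = 0; i < *count; ++i)
        data[i] *= factor;
}
```

Without the aliasing assumption, the compiler has to allow for data and count overlapping in memory and may re-read the bound on every iteration.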
Input Language
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
compile C code as C99 | -std=c99 or -std=gnu99 | -std=c99 | -c99 | on by default |
enable OpenMP | -fopenmp | -openmp | -mp | on by default |
enable accelerator directives | not supported | not supported | -ta=host, -ta=nvidia or -ta=cc13 | not supported |
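A small test case for the C99 and OpenMP switches might look like the sketch below. The function name dot_product is hypothetical. The loop-scoped declaration of i needs C99 (or later), and the pragma only takes effect when the OpenMP option from the table is passed; without it the code should still compile and run serially.

```c
/* Hypothetical test case for the C99 and OpenMP options. */
double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;

    /* Each thread accumulates a private partial sum; the partial sums
     * are combined into sum when the parallel loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)   /* declaring i here requires C99 */
        sum += x[i] * y[i];

    return sum;
}
```

With GCC, for example, this would be compiled with -std=c99 -fopenmp.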
Target Architecture
Description | GCC Option | ICC Option | PGCC Option |
---|---|---|---|
select target processor | -march=<processor name> | -x <processor name> | -tp <processor name> |
enable automatic SMP parallelization | -ftree-parallelize-loops=<n> | -parallel | -Mconcur |
Notes
- The list of recognized processor names differs between compilers, is very long, and keeps growing. See each compiler's manual for the current list.
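One way to check which instruction sets a target option actually enabled is to test the predefined feature macros. The sketch below assumes the macros __SSE2__, __AVX__, __AVX2__ and __AVX512F__, which GCC and the Intel compiler define when the corresponding instruction set is enabled; whether the PGI compiler defines them is not confirmed here. The function name enabled_simd_level is hypothetical.

```c
/* Hypothetical helper: reports the widest SIMD extension that was enabled
 * at compile time, based on predefined feature macros (assumed to be set
 * by the compiler according to the selected target processor). */
const char *enabled_simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none detected";
#endif
}
```

Compiling the same file with different target processor names and printing the returned string shows how the selected target changes the available instruction sets.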
Diagnostic Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate auto-vectorization information | -ftree-vectorizer-verbose=<level> (newer GCC releases: -fopt-info-vec) | -vec-report=<level> | -Minfo=vec | -h list=m |
Notes
- the PGI compiler has several other -Minfo=<type> switches to produce diagnostic output on inlining, unrolling, IPO, auto-parallelization and accelerator usage
- the Cray compiler has several other -h list=<type> switches to produce diagnostic output on optimization
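To try out the vectorization reports, it helps to have one loop that is easy to vectorize and one that is not. The following sketch uses hypothetical function names. The first loop has independent iterations and restrict-qualified pointers (C99); the second carries a dependency from one iteration to the next, so compilers typically report it as not vectorized.

```c
/* Hypothetical test loops for comparing vectorization reports. */

/* Independent iterations; restrict promises that x and y do not overlap,
 * so this loop is a good candidate for auto-vectorization. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Each iteration reads the value written by the previous one
 * (loop-carried dependency), which normally prevents straightforward
 * vectorization. */
void prefix_sum(int n, float *v)
{
    for (int i = 1; i < n; ++i)
        v[i] += v[i - 1];
}
```

Compiling this file with -O3 and the diagnostic option of the respective compiler should list both loops together with the reason given for the second one.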
Assembler Output
Description | GCC Option | ICC Option | PGCC Option | CrayCC Option |
---|---|---|---|---|
generate assembler output instead of object file | -S | -S | -S | -S |
generate assembler output with source code included in asm file | -g -Wa,-a,-ad | -S -fsource-asm | -S -Manno | -h list=d (not asm but intermediate representation) |
Notes
- Assembler code of object files or executables can be shown with:
  - objdump -d <objfile>
- Source-annotated assembler code of object files or executables can be shown with the following command (the code must have been compiled with debug information, i.e. with -g):
  - objdump -dS <objfile>