Useful Compiler Options On x86: Difference between revisions

Revision as of 10:58, 18 May 2011

Introduction

The following options help improve performance and diagnose performance problems when compiling code for x86 processors with the GCC, Intel, or PGI C and Fortran compilers.

Optimization Switches

Description	GCC Option	ICC Option	PGCC Option
Enable aggressive optimization features in general	-O3	-fast	-fast
Enable relaxed floating-point builtin functions	-ffast-math	-fp-model fast	-Mfprelaxed
Round denormalized FP values to zero	included in -ffast-math	-ftz	-Mdaz
Assume associativity of floating-point operations	-fno-signed-zeros -fno-trapping-math -fassociative-math		-Mvect=assoc
Assume ANSI C aliasing rules are not violated	-fstrict-aliasing (included in -O2 and -O3)	-ansi-alias
Enable interprocedural optimization across file boundaries	-fwhole-program -combine	-ipo (included in -fast)	-Mipa=fast or -Mipa=fast,inline
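
The associativity options in the table change numerical results, which is why they are not part of plain -O3. The following is a minimal sketch, assuming IEEE 754 double precision as on x86; the function names sum_left and sum_right are illustrative only. The volatile temporaries are there solely to stop the compiler from merging the two evaluation orders.

```c
/* Evaluate (a + b) + c, forcing the intermediate to be rounded to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Evaluate a + (b + c), forcing the intermediate to be rounded to double. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left returns 1.0 while sum_right returns 0.0, because 1.0 is lost when added to -1e20. A compiler that is allowed to reassociate may pick either order.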

Notes

Using -ansi-alias often helps the Intel compiler auto-vectorize code, especially when OpenMP is enabled.
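
Options such as -fstrict-aliasing and -ansi-alias let the compiler assume that pointers to different types never refer to the same object. Code that reinterprets bits through a pointer cast violates that assumption and may break once the option is active. A minimal sketch of a conforming alternative, assuming a 32-bit IEEE 754 float as on x86 (float_bits is a hypothetical helper name):

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float without casting between pointer types.
 * Copying the object representation with memcpy is well defined, whereas
 * *(uint32_t *)&f would violate the aliasing rules. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Optimizing compilers typically turn the fixed-size memcpy into a single register move, so this form should not cost performance.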

Input Language

Description	GCC Option	ICC Option	PGCC Option	CrayCC Option
Compile C code as C99	-std=c99 or -std=gnu99	-std=c99	-c99	on by default
Enable OpenMP	-fopenmp	-openmp	-mp	on by default
Enable accelerator directives	not supported	not supported	-ta=host, -ta=nvidia or -ta=cc13	not supported
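
As a small illustration of what the OpenMP switch affects, here is a sketch of a dot product with an OpenMP reduction. The function names dot and built_with_openmp are made up for this example. Without the OpenMP option the pragma is ignored and the loop runs serially; with it, the standard _OPENMP macro is defined and the loop is parallelized.

```c
/* Dot product of x and y; parallel when compiled with OpenMP enabled. */
double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Report whether the OpenMP option was active at compile time. */
int built_with_openmp(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```

The loop declares its counter inside the for statement, which also requires C99 or later, so the same file exercises the first row of the table.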

Target Architecture

Description	GCC Option	ICC Option	PGCC Option
Select target processor	-march=<processor name>	-x <processor name>	-tp <processor name>
Enable automatic SMP parallelization	not supported	not supported	-Mconcur

Notes

The list of recognized processors differs between compilers, is very long, and keeps growing. See each compiler's manual for its list.
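
One way to check what a given target selection actually enabled is to test the predefined feature macros. The sketch below assumes GCC, which predefines __SSE2__ and __AVX__ when the corresponding instruction sets are enabled; other compilers may use different macros. The function name simd_level is hypothetical.

```c
/* Return a label for the widest SIMD extension enabled at compile time.
 * Only two GCC feature macros are checked here; this is not exhaustive. */
const char *simd_level(void)
{
#if defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline";
#endif
}
```

Compiling the same file with different target processor settings and printing the result shows whether the option had the intended effect.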

Diagnostic Output

Description	GCC Option	ICC Option	PGCC Option	CrayCC Option
Generate auto-vectorization information	-ftree-vectorizer-verbose=<level>	-vec-report=<level>	-Minfo=vec	-h list=m

Notes

The PGI compiler has several other -Minfo=<type> switches that produce diagnostic output on inlining, unrolling, IPO, auto-parallelization, and accelerator usage.
The Cray compiler has several other -h list=<type> switches that produce diagnostic output on optimization.
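
A simple loop is a convenient test case for the vectorization reports. The following sketch is a single-precision a*x + y kernel; the name saxpy and the argument order are chosen for illustration. The C99 restrict qualifiers promise that x and y do not overlap, which is an assumption the caller must guarantee.

```c
/* y[i] = a * x[i] + y[i] for i in [0, n).
 * restrict tells the compiler that x and y never overlap, removing a
 * dependence that could otherwise prevent vectorization. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiling this file with one of the diagnostic switches from the table should report whether the loop was vectorized; removing restrict and comparing the report is a quick way to see the effect of aliasing information.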

Assembler Output

Description	GCC Option	ICC Option	PGCC Option	CrayCC Option
Generate assembler output instead of object file	-S	-S	-S	-S
Generate assembler output with source code included in asm file	-g -Wa,-a,-ad	-S -fsource-asm	-S -Manno	-h list=d (not asm but intermediate representation)

@@ Line 38: / Line 38: @@
 |-
 | Enable interprocedural optimization over file boundaries
-| -fwhole-program
+| -fwhole-program -combine
 | -ipo (included -fast)
 | -Mipa=fast or -Mipa=fast,inline
