- Information contained in the HLRS Wiki is not legally binding and is provided without warranty; HLRS is not responsible for any damages that might result from its use -

Porting to SX-9

From HLRS Platforms

Latest revision as of 16:00, 28 November 2008

general approach

Please do not compile on the SX node itself; use the cross-compilers on the frontends instead. Use the cross-compilers installed on ontake and yari, not the versions on asama/a1.

simple case: you have a makefile

Insert the names of the proper cross-compilation tools into your makefile. The cross tools have the prefix sx.

The following tools are available:

sxf90
compiler for Fortran 90/Fortran 95 with some Fortran 2003 features
sxcc
compiler for C89 and C99 (use sxcc -Kc99 for C99)
sxc++
compiler for C++ (ISO standard compliant)
sxmpif90/sxmpicc/sxmpic++
wrappers for MPI compilation; they link the correct MPI libraries without extra options
sxmake
make command as on SX, can deal with dependencies referencing objects within libraries
sxar
tool to create SX libraries (that one is often forgotten when editing makefiles!)
sxnm
tool to display symbol names of SX objects
sxsize
tool to display sizes of objects
sxcpp
preprocessor, if it has to be called explicitly
sxld
linker
sxstrip
tool to strip symbols from objects
sxas
assembler
sxftrace
tool to format ftrace performance traces

As the frontends have multiple cores, parallel compilation using

make -j 4

or

sxmake -P

can be used to speed up the compilation process (compiling on the frontends is a lot faster than on the SX node).

complex case: configure based setup

Cross-configuring is tricky and does not always work; it is best to run configure on the interactive node v900.

Set the compiler variables to the sx-prefixed versions (some links with these names exist), for example:

./configure CC=sxcc F90=sxf90 FC=sxf90 

The generated makefile should be usable on the frontends to compile the application (ar may have to be replaced by sxar).


tips and traps

This is a collection of pitfalls that people tend to fall into when porting to SX.

  • Code dumps core after C malloc() is used. Make sure you include stdlib.h when using malloc(); otherwise the return value is assumed to be int (implicit declaration in C89), and the SX calling convention strips significant bits from the address. Also make sure the result of malloc() is never assigned to an int: SX is LP64, not ILP64, so a pointer does not fit into an int.
  • Only 2GB of memory can be allocated with malloc(). For historical reasons, size_t is 32bit by default, even though pointers (void *) are 64bit. Use -size_t64 with the C or Fortran compilers to get rid of that restriction.
  • My C code seems to have a problem with integer divisions. SX uses 56bit precision division for 64bit integer types by default. Use the -xint switch to get full 64bit division (and lose some performance).
  • Code stops with "Loop count is greater than that assumed by the compiler: loop-count=n". The compiler sometimes has to guess the loop length in order to allocate work vectors (only for partially vectorized loops). This educated guess might be wrong. Help the compiler by giving -W,-pvctl,noassume,loopcnt=n where n is the maximum loop count or a larger value, or use -W,pvctl=vwork=stack to allocate these work vectors on the stack at runtime.
  • How do Fortran function names appear in the calling convention? Simply write them in lowercase and append an underscore. Example:
     Fortran caller:
           PROGRAM TEST
           CALL CFUNC(5)
           END
     C callee:
           void cfunc_(int *a) {
           }
  • What about parameters? Fortran passes arguments by reference, so use pointers in C for scalars. Each character string adds an extra parameter of type long at the end of the parameter list, containing the length of the string. See chapter 4 of the C compiler manual for details.
  • How to link when C and Fortran are mixed? Link with f90; it handles this correctly. When using C++, or using C++/SX as the C compiler, use C++/SX as the linker with the option -f90lib.
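
The naming and parameter rules from the last items can be sketched in C as follows. This is only an illustration under assumptions: the routine name CSTR and the helper fstr_to_cstr are hypothetical and do not come from the SX documentation, and the sketch relies on the usual Fortran behaviour that character strings are blank-padded and not NUL-terminated.

```c
#include <stdio.h>
#include <string.h>

/* Fortran side, for reference (CSTR is a hypothetical routine name):
 *       CALL CFUNC(5)
 *       CALL CSTR('hello   ')
 */

/* Lowercase name plus trailing underscore; the scalar arrives by reference.
 * The caller may pass a literal, so the value is only read, never written. */
void cfunc_(int *a)
{
    printf("cfunc_ received %d\n", *a);
}

/* Hypothetical helper: copy a blank-padded Fortran string of length len
 * into a NUL-terminated C buffer, dropping trailing blanks.
 * Returns the number of characters stored (excluding the NUL). */
size_t fstr_to_cstr(const char *fstr, long len, char *out, size_t outsize)
{
    size_t n;

    if (outsize == 0)
        return 0;
    if (len < 0)
        len = 0;
    while (len > 0 && fstr[len - 1] == ' ')
        len--;                      /* strip Fortran blank padding */
    n = (size_t)len;
    if (n > outsize - 1)
        n = outsize - 1;            /* truncate to fit the buffer */
    memcpy(out, fstr, n);
    out[n] = '\0';
    return n;
}

/* The string length is appended as a hidden argument of type long
 * at the end of the parameter list. */
void cstr_(char *s, long len)
{
    char buf[64];

    fstr_to_cstr(s, len, buf, sizeof buf);
    printf("cstr_ received \"%s\" (declared length %ld)\n", buf, len);
}
```

Compiled with sxcc and linked with the Fortran compiler as described above, cfunc_ and cstr_ would be callable from Fortran as CFUNC and CSTR; the exact layout of hidden arguments should still be checked against the C compiler manual.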
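
The first three pitfalls (malloc() without a prototype, 32bit size_t, reduced-precision integer division) can be probed with a few small C helpers. This is a minimal sketch, not an official test: all function names are hypothetical, and whether the division check actually exposes the 56bit behaviour on SX is an assumption, since the remainder operation may be affected in the same way.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>   /* declares malloc(); never rely on an implicit int return */

/* Returns 1 if a pointer would fit into an int (not the case on LP64). */
int pointer_fits_in_int(void)
{
    return sizeof(void *) <= sizeof(int);
}

/* Returns 1 if size_t is wide enough to describe objects beyond 2GB. */
int size_t_is_64bit(void)
{
    return sizeof(size_t) * CHAR_BIT >= 64;
}

/* Allocate n doubles; the result is kept in a pointer, never in an int.
 * Returns NULL if n * sizeof(double) would overflow size_t. */
double *alloc_doubles(size_t n)
{
    if (n > (size_t)-1 / sizeof(double))
        return NULL;
    return malloc(n * sizeof(double));
}

/* Returns 1 if quotient and remainder reproduce the dividend exactly.
 * unsigned long is 64bit on LP64; unsigned arithmetic avoids overflow traps. */
int division_is_exact(unsigned long a, unsigned long b)
{
    unsigned long q, r;

    if (b == 0)
        return 0;
    q = a / b;
    r = a % b;
    return r < b && q * b + r == a;
}

/* Print a short report; intended to be called from main() on the target. */
void report_porting_checks(void)
{
    printf("pointer fits in int : %s\n", pointer_fits_in_int() ? "yes" : "no");
    printf("size_t is 64bit     : %s\n", size_t_is_64bit() ? "yes" : "no");
    printf("large division exact: %s\n",
           division_is_exact(ULONG_MAX - 12345UL, 3UL) ? "yes" : "no");
}
```

If size_t turns out not to be 64bit, or the division check reports a mismatch, the -size_t64 and -xint switches mentioned above are the options to try.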