- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
CRAY XE6 Programming Hints
Latest revision as of 14:21, 12 January 2012
General Notice
Please first check the general Programming How-To's, Tips & Tricks page for further information (compiler options, MPI-IO optimization) that is beneficial on the Cray XE6 as well.
On the login servers of the Cray XE6 system, users get by default the setup for the highly optimizing programming environment provided by Cray:
- Compilers
- Efficient communications library
- …
All of these software components have been tested and proven to work on the system. If you encounter problems with this environment, please file a bug report using the HLRS trouble ticket system (http://www.hlrs.de/organization/sos/hpcn/services/trouble-ticket-submission-form/). A different compiler or environment (e.g. Open MPI) should be regarded as academic and can be used as a workaround. Please be aware that you will get the most efficient support (and performance) by using the software provided by the vendor.
Please be aware that the AMD Interlagos processor architecture is brand new. This will cause frequent software updates. To always get the best performance, users should recompile their programs.
Programming Hints
This page provides hints and how-tos for common programming tasks and problems that are specific to the Cray XE6. For general programming information, we refer to our Wiki pages such as usage of Open MPI, or Best practices for MPI-IO.
Cray compiler
Cray compiler and inline assembler
The Cray compiler as of now (v. 7.4.4) does not support inline assembly with the __asm__ extension. A common piece of code that fails to compile for this reason is a read of the Intel/AMD Time Stamp Counter (rdtsc), as in the following function:
static inline unsigned long long getrdtsc(void)
{
    unsigned long long x;
#if defined (__i386__)
    __asm__ __volatile__ ("rdtsc" : "=A" (x));
#elif defined (__x86_64__)
    unsigned int tickl, tickh;
    __asm__ __volatile__ ("rdtsc" : "=a" (tickl), "=d" (tickh));
    x = ((unsigned long long)tickh << 32) | tickl;
#else
#warning "Architecture not yet supported in ASM"
#endif
    return x;
}
A simple workaround is to compile only the affected functions to assembly with gcc and then build the rest of the program with the Cray compiler against that assembly file:
- Copy the function into its own file.
- Remove the static keyword from the function definition so that the symbol is visible to other files.
- Compile using gcc into an assembler output file called rdtsc.S (without -o, gcc -S would write to rdtsc.s):
gcc -O2 -S -o rdtsc.S rdtsc.c
- Remove the #APP and #NO_APP markers that gcc places around the __asm__ section, and strip all other lines that are not needed:
    .text
    .p2align 4,,15
    .globl getrdtsc
    .type getrdtsc, @function
getrdtsc:
.LFB2:
    rdtsc
    movl %eax, %ecx
    movq %rdx, %rax
    salq $32, %rax
    mov %ecx, %edx
    orq %rdx, %rax
    ret
.LFE2:
    .size getrdtsc, .-getrdtsc
    .section .eh_frame,"a",@progbits
- Finally, compile the files that call the function together with the assembly file:
craycc main.c rdtsc.S
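On the calling side, main.c only needs a declaration of the function that now lives in rdtsc.S. The following is a minimal sketch of how such a caller might look; measure_ticks, fake_ticks and sum_workload are hypothetical names introduced here for illustration and are not part of the original page. The counter is passed as a function pointer so that the sketch can be tried with the stand-in counter even when rdtsc.S is not linked.

```c
/* Defined in rdtsc.S; this declaration is all the calling file needs. */
extern unsigned long long getrdtsc(void);

/* Signature shared by getrdtsc and any replacement counter. */
typedef unsigned long long (*tick_fn)(void);

/* Hypothetical helper: number of ticks that elapse while work(arg) runs. */
unsigned long long measure_ticks(tick_fn now, void (*work)(void *), void *arg)
{
    unsigned long long start = now();
    work(arg);
    return now() - start;
}

/* Stand-in counter (an assumption for illustration only): advances by
 * one per call, so the sketch can be tried without rdtsc.S. */
unsigned long long fake_ticks(void)
{
    static unsigned long long ticks = 0;
    return ticks++;
}

/* Example workload: sums the integers below *(int *)arg. The volatile
 * accumulator keeps the loop from being optimized away. */
void sum_workload(void *arg)
{
    volatile long sum = 0;
    int n = *(int *)arg;
    int i;
    for (i = 0; i < n; i++)
        sum += i;
}
```

In a real main.c one would call measure_ticks(getrdtsc, sum_workload, &n) and link against rdtsc.S as in the craycc line above. Calling getrdtsc() directly works just as well; the function pointer is only there to make the counter exchangeable.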
ALPS usage
Usually, the programmer does not need direct access to the ALPS launcher on the Cray, because MPI does everything necessary to set up the processes. However, when writing parallel applications that access uGNI, or when information about the allocation is needed (e.g. to detect the number of threads that have been allocated in this qsub), ALPS information can be accessed easily.
The main header files are libalps.h for ALPS access, libalpslli.h for the low-level interface, and libalpsutil.h for placement information on the compute nodes (alps_get_placement_info).
Usage is straightforward. To request information that requires a response, such as the application ID (apid), send a request with alps_app_lli_put_request and wait for the response with alps_app_lli_get_response. If the status is OK (the example tests only the return value), receive the actual message with alps_app_lli_get_response_bytes.
The following example gets the application ID and then the application info (app_info, which contains information such as the GNI pTag and cookie):
/* Ask ALPS for the application ID (apid). */
ret = alps_app_lli_put_request(ALPS_APP_LLI_ALPS_REQ_APID, NULL, 0);
if (ALPS_APP_LLI_ALPS_STAT_OK != ret)
    ERROR (ret, "alps_app_lli_put_request()");

/* Wait for the response header. */
ret = alps_app_lli_get_response (&alps_status, &alps_count);
if (ALPS_APP_LLI_ALPS_STAT_OK != ret)
    ERROR (ret, "alps_app_lli_get_response()");

/* Receive the actual message: the apid. */
ret = alps_app_lli_get_response_bytes (&apid, sizeof(apid));
if (ALPS_APP_LLI_ALPS_STAT_OK != ret)
    ERROR (ret, "alps_app_lli_get_response_bytes()");

/* Look up the application info for this apid. */
ret = alps_get_appinfo(apid, &app_info, &app_cmddetail, &app_places);
if (-1 == ret)
    ERROR (ret, "alps_get_appinfo()");
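The example uses an ERROR macro that this page does not define. One possible definition is sketched below, assuming that a failed ALPS call should simply print a diagnostic and terminate the program; format_error is a hypothetical helper introduced here and is not part of the ALPS headers.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the ALPS headers): formats a diagnostic
 * for a failed call into buf and returns the value of snprintf. */
int format_error(char *buf, size_t len, int ret, const char *what,
                 const char *file, int line)
{
    return snprintf(buf, len, "%s:%d: %s failed with return value %d",
                    file, line, what, ret);
}

/* One possible ERROR macro: report the failing call on stderr and exit.
 * The do/while(0) wrapper lets it be used as a single statement after
 * an if without braces, as in the example above. */
#define ERROR(ret, what)                                        \
    do {                                                        \
        char msg_[256];                                         \
        format_error(msg_, sizeof(msg_), (ret), (what),         \
                     __FILE__, __LINE__);                       \
        fprintf(stderr, "%s\n", msg_);                          \
        exit(EXIT_FAILURE);                                     \
    } while (0)
```

Whether exiting is appropriate depends on the application; an MPI program might prefer to call MPI_Abort instead of exit.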