- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
CRAY XC40 Hardware and Architecture
== Hazelhen production system ==

=== Summary Hazelhen production system ===
{| class="wikitable"
|-
! Cray Cascade [http://www.cray.com/Products/Computing/XC.aspx XC40] Supercomputer
! Step 2
|-
| Performance
* Peak
* [http://top500.org/ top500] [http://top500.org/site/50543 HPL]<br/><br/><br/>
* [http://www.hpcg-benchmark.org/ HPCG]<br/><br/><br/>
| <br/>
7.4 Pflops<br/>
5.64 Pflops (76% of peak),<br/>[http://top500.org/lists/2015/11/ November 2015] list rank 8<br/>([http://top500.org/lists/2016/06/ 2016/06] rank 9, [https://www.top500.org/list/2016/11/?page=1 2016/11] rank 14, [https://www.top500.org/list/2017/06/?page=1 2017/06] rank 17, [https://www.top500.org/list/2017/11/?page=1 2017/11] rank 19, [https://www.top500.org/list/2018/06/?page=1 2018/06] rank 27, [https://www.top500.org/list/2018/11/?page=1 2018/11] rank 30)<br/>
0.138 Pflops (2% of peak),<br/>[http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=282 November 2015] HPCG results rank 6<br/>([http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=288 2016/06] rank 10, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=289 2016/11] rank 12, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=291 2017/06] rank 13, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=293 2017/11] rank 14, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=295 2018/06] rank 17, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=297 2018/11] rank 20)<br/>
|-
| Cray Cascade Cabinets
| 41
|-
| Number of Compute Nodes
| 7712 (dual socket)
|-
| Compute Processors
* Total number of CPUs
* Total number of Cores
| <br/>
7712 * 2 = 15424 Intel Haswell [http://ark.intel.com/products/81908/Intel-Xeon-Processor-E5-2680-v3-30M-Cache-2_50-GHz E5-2680 v3], 2.5 GHz, 12 cores, 2 hyper-threads/core<br/>
15424 * 12 = 185088
|-
| Compute Memory on Scalar Processors
* Memory Type
* Memory per Compute Node
* Total Scalar Compute Memory
| <br/>
DDR4-2133 registered ECC<br/>
128 GB<br/>
987136 GB (≈ 964 TiB)<br/>
|-
| Interconnect
| Cray Aries
|-
| Service Nodes (I/O and Network)
| 90
|-
| External Login Servers
| 10
|-
| Pre- and Post-Processing Servers
| 3 Cray CS300: each with 4x Intel Xeon E5-4620 v2 @ 2.60 GHz (Ivy Bridge), 32 cores, 512 GB DDR3 memory (PC3-14900R), 7.1 TB scratch disk space (4x ~2 TB RAID0), NVIDIA Quadro K6000 (12 GB GDDR5), single job usage
<br/>
5 Cray CS300: each with 2x Intel Xeon E5-2640 v2 @ 2.00 GHz, 16 cores, 256 GB DDR3 memory (PC3-14900R), 3.6 TB scratch disk space (2x ~1.8 TB), NVIDIA Quadro K5000 (4 GB GDDR5), single job usage
<br/>
3 Supermicro SuperServers: each with 4x Intel Xeon X7550 (Nehalem EX octo-core) @ 2.00 GHz (4*8 = 32 cores, 64 hyper-threads), 128 GB RAM, 5.5 TB scratch disk space (10x ~600 GB), NVIDIA Quadro 6000 (GF100 Fermi) GPU with 14 SMs, 448 CUDA cores, 6 GB GDDR5 RAM (384-bit interface, 144 GB/s), single job usage
<br/>
1 Supermicro SuperServer: with 8x Intel Xeon X7550 (Nehalem EX octo-core) @ 2.00 GHz (8*8 = 64 cores, 128 hyper-threads), 1 TB RAM, 6.6 TB scratch disk space (14x ~600 GB), NVIDIA Quadro 6000 (GF100 Fermi) GPU with 14 SMs, 448 CUDA cores, 6 GB GDDR5 RAM (384-bit interface, 144 GB/s), multi job usage
<br/>
2 Cray CS300: each with 4x Intel Xeon E5-4620 v2 @ 2.60 GHz (Ivy Bridge), 32 cores, 1536 GB DDR3 memory (PC3-14900R), 15 TB scratch disk space (4x ~4 TB RAID0), NVIDIA Quadro K6000 (12 GB GDDR5), multi job usage
|-
| User Storage
* Lustre Workspace Capacity
| <br/>
~10 PB
|-
| Cray Linux Environment (CLE)
* Compute Node Linux
* Cluster Compatibility Mode (CCM)
* Data Virtualization Services (DVS)
| Yes
|-
| PGI Compiling Suite (Fortran, C, C++) including Accelerator
| 25 users (shared with Step 1)
|-
| Cray Developer Toolkit
* Cray Message Passing Toolkit (MPI, SHMEM, PMI, DMAPP, Global Arrays)
* PAPI
* GNU compiler and libraries
* Java
* Environment setup (Modules)
* Cray Debugging Support Tools
** lgdb
** STAT
** ATP
| Unlimited users
|-
| Cray Programming Environment
* Cray Compiling Environment (Fortran, C, C++)
* Cray Performance Monitoring and Analysis
** CrayPat
** Cray Apprentice2
* Cray Math and Scientific Libraries
** Cray Optimized BLAS
** Cray Optimized LAPACK
** Cray Optimized ScaLAPACK
** IRT (Iterative Refinement Toolkit)
| Unlimited users
|-
| Allinea DDT Debugger
| 2048 processes (shared with Step 1)
|-
| Lustre Parallel Filesystem
| Licensed on all sockets
|-
| Intel Composer XE
* Intel C++ Compiler XE
* Intel Fortran Compiler XE
* Intel Parallel Debugger Extension
* Intel Integrated Performance Primitives
* Intel Cilk Plus
* Intel Parallel Building Blocks
* Intel Threading Building Blocks
* Intel Math Kernel Library
| 10 seats
|}
For detailed information see [https://www.hlrs.de/fileadmin/_assets/events/workshops/XC40_Intro_2014-09.pdf XC40-Intro].

For information on the Aries network see [[Communication_on_Cray_XC40_Aries_network]].
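The headline figures in the table above can be cross-checked from the node and core counts. As a sketch (our calculation, assuming the usual Haswell per-core throughput of 16 double-precision floating-point operations per cycle from two AVX2 FMA units, a figure not stated in the table itself):

:<math>185088\ \text{cores} \times 2.5\ \text{GHz} \times 16\ \tfrac{\text{flop}}{\text{cycle}} \approx 7.4\ \text{Pflops}</math>

and for the total compute memory:

:<math>7712\ \text{nodes} \times 128\ \text{GB} = 987136\ \text{GB} \approx 964\ \text{TiB}</math>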
=== Architecture ===
* System Management Workstation (SMW)
** the system administrator's console for managing the Cray system: monitoring, installing/upgrading software, controlling the hardware, and starting and stopping the XC40 system
* service nodes, classified as:
** login nodes for users to [[CRAY_XC30_access| access]] the system
** boot nodes, which provide the OS for all other nodes, licenses, ...
** network nodes, which provide e.g. external network connections for the compute nodes
** Cray Data Virtualization Service (DVS) nodes: an I/O forwarding service that can parallelize the I/O transactions of an underlying POSIX-compliant file system
** sdb nodes for services like ALPS, Torque, Moab, Slurm, Cray management services, ...
** I/O nodes, e.g. for Lustre
** MOM nodes for placing user jobs of the batch system into execution
* compute nodes
** are only available to users through the [[CRAY_XC40_Using_the_Batch_System | batch system]] and the Application Level Placement Scheduler (ALPS), see [http://docs.cray.com/cgi-bin/craydoc.cgi?mode=Show;q=2496;f=/books/S-2496-5001/html-S-2496-5001/cnl_apps.html running applications] and the example sketched after this list
*** Each compute node is equipped with 128 GB of memory and attached to the fast interconnect (Cray Aries).
*** [http://www.cray.com/Assets/PDF/products/xc/CrayXC30Networking.pdf Details about the interconnect of the Cray XC series network] and [[Communication_on_Cray_XC40_Aries_network]]
* In the future, the storage switch fabrics of Step 2a and Step 1 will be connected, so that the Lustre workspace filesystems can be used from the hardware (login servers and pre-/post-processing servers) of both Step 1 and Step 2a.
[[File:step2a-concept.jpg]]
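To illustrate how the compute nodes are used in practice, the following is a minimal sketch of an MPI program built with the Cray Programming Environment and launched through ALPS. The compiler wrapper <code>cc</code> and the launcher <code>aprun</code> are the standard Cray tools; the module names, node counts, and core counts in the comments are placeholder values, not site defaults, and must be adapted to the actual batch configuration.

<syntaxhighlight lang="c">
/*
 * hello_xc40.c - minimal MPI check run on the XC40 compute nodes.
 *
 * Build on a login node with the Cray compiler wrapper, which links the
 * Cray MPT MPI library automatically:
 *     cc hello_xc40.c -o hello_xc40
 * (To use another compiler suite, swap the programming environment first,
 *  e.g. "module swap PrgEnv-cray PrgEnv-intel" -- module names are examples.)
 *
 * Launch on the compute nodes through the batch system with ALPS, e.g. from
 * inside a job script (node/core counts below are placeholders):
 *     aprun -n 48 -N 24 ./hello_xc40
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char node[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process           */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of MPI ranks      */
    MPI_Get_processor_name(node, &len);     /* compute node this rank runs on */

    printf("rank %d of %d on node %s\n", rank, size, node);

    MPI_Finalize();
    return 0;
}
</syntaxhighlight>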
=== Pictures ===
[[Image:hazelhen.jpg]]
[[Image:hazelhen-cooling1.jpg]]
[[Image:hazelhen-behind-front.jpg]]
[[Image:hazelhen-blade1.jpg]]
[[Image:hazelhen-blade2.jpg]]
[[Image:Hermit1-Folie7.jpg]]
[[Image:Hermit1-Folie8.jpg]]
[[Image:Hermit1-Folie9.jpg]]