- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

CRAY XC40 Hardware and Architecture

== Hazelhen production system ==

=== Summary Hazelhen production system ===
 
{| class="wikitable"
|-
! Cray Cascade [http://www.cray.com/Products/Computing/XC.aspx XC40] Supercomputer
! Step 2
|-
| Performance
* Peak
* [http://top500.org/ top500] [http://top500.org/site/50543 HPL]<br/><br/><br/>
* [http://www.hpcg-benchmark.org/ HPCG]<br/><br/><br/>
| <BR>
7.4 Pflops<BR>
5.64 Pflops (76% Peak),<br/>[http://top500.org/lists/2015/11/ November 2015] list rank 8<br/>([http://top500.org/lists/2016/06/ 2016/06] rank 9, [https://www.top500.org/list/2016/11/?page=1 2016/11] rank 14, [https://www.top500.org/list/2017/06/?page=1 2017/06] rank 17, [https://www.top500.org/list/2017/11/?page=1 2017/11] rank 19, [https://www.top500.org/list/2018/06/?page=1 2018/06] rank 27, [https://www.top500.org/list/2018/11/?page=1 2018/11] rank 30)<BR>
0.138 Pflops (2% Peak),<br/>[http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=282 November 2015] HPCG results rank 6<br/>([http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=288 2016/06] rank 10, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=289 2016/11] rank 12, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=291 2017/06] rank 13, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=293 2017/11] rank 14, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=295 2018/06] rank 17, [http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=297 2018/11] rank 20)<BR>
|-
| Cray Cascade Cabinets
| 41
|-
| Number of Compute Nodes
| 7712 (dual socket)
|-
| Compute Processors
* Total number of CPUs
* Total number of Cores
| <BR>
7712*2 = 15424 Intel Haswell [http://ark.intel.com/products/81908/Intel-Xeon-Processor-E5-2680-v3-30M-Cache-2_50-GHz E5-2680 v3], 2.5 GHz, 12 cores, 2 hyper-threads/core<BR>
15424*12 = 185088
|-
| Compute Memory on Scalar Processors
* Memory Type
* Memory per Compute Node
* Total Scalar Compute Memory
| <BR>
DDR4-2133 registered ECC<BR>
128 GB<BR>
7712*128 GB = 987136 GB = 964 TB<BR>
|-
| Interconnect
| Cray Aries
|-
| Service Nodes (I/O and Network)
| 90
|-
| External Login Servers
| 10
|-
| Pre- and Post-Processing Servers
| 3 Cray CS300: each with 4x Intel Xeon E5-4620 v2 @ 2.60 GHz (Ivy Bridge), 32 cores, 512 GB DDR3 memory (PC3-14900R), 7.1 TB scratch disk space (4x ~2 TB RAID0), NVidia Quadro K6000 (12 GB GDDR5), single-job usage
<BR>
5 Cray CS300: each with 2x Intel Xeon E5-2640 v2 @ 2.00 GHz, 16 cores, 256 GB DDR3 memory (PC3-14900R), 3.6 TB scratch disk space (2x ~1.8 TB), NVidia Quadro K5000 (4 GB GDDR5), single-job usage
<BR>
3 Supermicro SuperServer: each with 4x Intel Xeon X7550 (Nehalem-EX octa-core) @ 2.00 GHz (4*8 = 32 cores, 32*2 = 64 hyper-threads), 128 GB RAM, 5.5 TB scratch disk space (10x ~600 GB), NVidia Quadro 6000 (GF100 Fermi) GPU with 14 SMs, 448 CUDA cores, 6 GB GDDR5 RAM (384-bit interface, 144 GB/s), single-job usage
<BR>
1 Supermicro SuperServer: with 8x Intel Xeon X7550 (Nehalem-EX octa-core) @ 2.00 GHz (8*8 = 64 cores, 64*2 = 128 hyper-threads), 1 TB RAM, 6.6 TB scratch disk space (14x ~600 GB), NVidia Quadro 6000 (GF100 Fermi) GPU with 14 SMs, 448 CUDA cores, 6 GB GDDR5 RAM (384-bit interface, 144 GB/s), multi-job usage
<BR>
2 Cray CS300: each with 4x Intel Xeon E5-4620 v2 @ 2.60 GHz (Ivy Bridge), 32 cores, 1536 GB DDR3 memory (PC3-14900R), 15 TB scratch disk space (4x ~4 TB RAID0), NVidia Quadro K6000 (12 GB GDDR5), multi-job usage
|-
| User Storage
* Lustre Workspace Capacity
| <BR>
~10 PB
|-
| Cray Linux Environment (CLE)
* Compute Node Linux
* Cluster Compatibility Mode (CCM)
* Data Virtualization Services (DVS)
| Yes
|-
| PGI Compiling Suite (Fortran, C, C++) including Accelerator
| 25 users (shared with Step 1)
|-
| Cray Developer Toolkit
* Cray Message Passing Toolkit (MPI, SHMEM, PMI, DMAPP, Global Arrays)
* PAPI
* GNU compiler and libraries
* Java
* Environment setup (Modules)
* Cray Debugging Support Tools
** lgdb
** STAT
** ATP
| Unlimited users
|-
| Cray Programming Environment
* Cray Compiling Environment (Fortran, C, C++)
* Cray Performance Monitoring and Analysis
** CrayPAT
** Cray Apprentice2
* Cray Math and Scientific Libraries
** Cray-optimized BLAS
** Cray-optimized LAPACK
** Cray-optimized ScaLAPACK
** IRT (Iterative Refinement Toolkit)
| Unlimited users
|-
| Allinea DDT Debugger
| 2048 processes (shared with Step 1)
|-
| Lustre Parallel Filesystem
| Licensed on all sockets
|-
| Intel Composer XE
* Intel C++ Compiler XE
* Intel Fortran Compiler XE
* Intel Parallel Debugger Extension
* Intel Integrated Performance Primitives
* Intel Cilk Plus
* Intel Parallel Building Blocks
* Intel Threading Building Blocks
* Intel Math Kernel Library
| 10 seats
|-
|}
For detailed information see [https://www.hlrs.de/fileadmin/_assets/events/workshops/XC40_Intro_2014-09.pdf XC40-Intro]
For information on the Aries network see [[Communication_on_Cray_XC40_Aries_network]]
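The aggregate figures in the table follow from the per-node data. A minimal sanity-check sketch in Python (added for illustration; the 16 double-precision flops/cycle value is an assumption based on Haswell's two AVX2 FMA units and is not stated in the table):

<pre>
# Re-derive the aggregate Hazelhen figures from the per-node data above.
nodes = 7712                # dual-socket XC40 compute nodes
sockets_per_node = 2
cores_per_cpu = 12          # Intel Xeon E5-2680 v3 (Haswell)
clock_ghz = 2.5
dp_flops_per_cycle = 16     # assumption: 2 AVX2 FMA units x 4 doubles x 2 flops

cpus = nodes * sockets_per_node                 # 15424 CPUs
cores = cpus * cores_per_cpu                    # 185088 cores
peak_pflops = cores * clock_ghz * dp_flops_per_cycle / 1e6

mem_gb = nodes * 128                            # 987136 GB
mem_tb = mem_gb / 1024                          # 964 TB

print(f"peak:   {peak_pflops:.1f} Pflops")      # 7.4 Pflops
print(f"HPL:    {5.64 / peak_pflops:.0%} of peak")  # 76%
print(f"memory: {mem_gb} GB = {mem_tb:.0f} TB")
</pre>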


=== Architecture ===
* System Management Workstation (SMW)
** the system administrator's console for managing the Cray system: monitoring, installing/upgrading software, controlling the hardware, and starting and stopping the XC40 system.


* service nodes are classified as:
** login nodes, for users to access the system
** boot nodes, which provide the OS for all other nodes, licenses, ...
** network nodes, which provide e.g. external network connections for the compute nodes
** Cray Data Virtualization Service (DVS) nodes: an I/O forwarding service that can parallelize the I/O transactions of an underlying POSIX-compliant file system
** sdb nodes, for services like ALPS, Torque, Moab, SLURM, Cray management services, ...
** I/O nodes, e.g. for Lustre
** MOM nodes, for placing user jobs of the batch system into execution


* compute nodes
** are only available to users via the [[CRAY_XC40_Using_the_Batch_System | batch system]] and the Application Level Placement Scheduler (ALPS), see [http://docs.cray.com/cgi-bin/craydoc.cgi?mode=Show;q=2496;f=/books/S-2496-5001/html-S-2496-5001/cnl_apps.html running applications] and the placement sketch after this list.
*** The compute nodes are installed with 128 GB memory each and connected by a fast interconnect (Cray Aries).
*** [http://www.cray.com/Assets/PDF/products/xc/CrayXC30Networking.pdf Details about the interconnect of the Cray XC series network] and [[Communication_on_Cray_XC40_Aries_network]]


* in the future, the storage switch fabrics of step 1 and step 2a will be connected, so the Lustre workspace filesystems can be used from the hardware (login servers and pre-processing servers) of both steps.
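Since the compute nodes are only reachable through the batch system and ALPS, application launches have to map ranks onto the 24 cores of a dual-socket node. A minimal placement sketch in Python (the helper and <code>./my_app</code> are hypothetical illustrations; <code>aprun -n/-N/-d</code> are standard ALPS options):

<pre>
import math

CORES_PER_NODE = 24  # 2 sockets x 12 cores on a Hazelhen compute node

def aprun_line(total_ranks, threads_per_rank=1):
    """Derive an ALPS launch line: -n = total PEs, -N = PEs per node, -d = CPUs per PE."""
    ranks_per_node = max(1, CORES_PER_NODE // threads_per_rank)
    nodes_needed = math.ceil(total_ranks / ranks_per_node)
    return (nodes_needed,
            f"aprun -n {total_ranks} -N {ranks_per_node} -d {threads_per_rank} ./my_app")

print(aprun_line(48))     # pure MPI:        (2, 'aprun -n 48 -N 24 -d 1 ./my_app')
print(aprun_line(24, 6))  # hybrid MPI+OpenMP: (6, 'aprun -n 24 -N 4 -d 6 ./my_app')
</pre>

The node count returned is what one would request from the batch system; with <code>aprun -j 2</code> the second hyper-thread per core (2 HT/core, see the table above) can also be used as a PE slot.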


[[File:step2a-concept.jpg]]
=== Pictures ===
[[Image:hazelhen.jpg]]
[[Image:hazelhen-cooling1.jpg]]
[[Image:hazelhen-behind-front.jpg]]
[[Image:hazelhen-blade1.jpg]]
[[Image:hazelhen-blade2.jpg]]
[[Image:Hermit1-Folie7.jpg]]
[[Image:Hermit1-Folie8.jpg]]
[[Image:Hermit1-Folie9.jpg]]
