- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Hunter installation schedule: Difference between revisions

From HLRS Platforms
(Created page with "This page will be updated as new information becomes available. Please be aware, HPC systems are made out of leading edge components. If one of this components is delayed, the complete schedule will change. Do not take this schedule too serious... {| class="wikitable" |- ! time frame !! action |- | June 18th - 20 || Hawk maintenance and preparation of infrastructure (1st part). System will be reduced to 4096 compute nodes in total |- | mid of August || delivery parts...")
 
(Table updated)
 
(16 intermediate revisions by 3 users not shown)
Line 1: Line 1:
This page will be updated as new information becomes available. Please be aware, HPC systems are made out of leading edge  
This page will be updated as new information becomes available. Please be aware, HPC systems are made out of leading edge  
components. If one of these components is delayed, the complete schedule will change. Do not take this schedule too seriously...
components. If one of these components is delayed, the complete schedule will change. Do not take this schedule too seriously...
More information can be found in [https://www.hlrs.de/news/detail/hpc-in-transition-hunter-and-herder-will-bring-new-opportunities-new-challenges HPC in transition: Hunter and Herder will bring new opportunities, new challenges] and [https://www.hlrs.de/news/detail/exascale-supercomputing-is-coming-to-stuttgart Exascale supercomputing is coming to Stuttgart].
A fact sheet can be found [[Media:HLRS Hunter mit Freigabe.pdf|here]].


{| class="wikitable"
{| class="wikitable"
Line 6: Line 10:
! time frame !! action
! time frame !! action
|-
|-
| June 18th - 20 || Hawk maintenance and preparation  
| June 18th - 19th || Hawk maintenance and preparation  
of infrastructure (1st part). System will be reduced to 4096 compute nodes in total
of infrastructure (1st part). System will be reduced to 4096 compute nodes in total
|-
|-
| mid of August || delivery parts of Hunter  
| October 1st || delivery of Hunter parts (racks, APU and CPU nodes, cooling, management systems, ...)
|-
| January 31st  || Acceptance of Hunter and start of data migration onto the new storage platform
|-
| end of April  || Hawk final power down
|-
| ========== || old stuff
|-
| mid-February  || Test Phase for pilot users  ~ 3 weeks. Due to the delayed delivery, the test phase will also be postponed. The filesystem will be ws9 (Hazel Hen)
|-
| February 24 2020 || Final shutdown and decommission of Hazel Hen
and preparation of infrastructure for the complete installation of Hawk.
|-
|-
| February 25 2020 || hazelhen workspace filesystem (ws9) has been fully integrated with all workspaces into the hawk system.
| early November  || software installation and first application tests by pilot users. December 10th: first pilot users are active and are going to install additional HLRS-provided software.
|-
|-
| until March 1st || testing, and integrate additional racks into
| November 18th || delivery of Lustre filesystems. December 9th: Lustre ws12 has been installed and will be tested for some time.
first phase (2048 nodes)
|-
|-
| March 9th || general availability for all users
| Q4 / 2024 || Hawk will be reduced by another 8 racks (if projects are active on Hunter). This step has been postponed until Hunter can be used properly.
|-
|-
| until March 15 || prepare power and cooling facility for second phase hawk
| January 31st 2025 || Acceptance of Hunter and start of data migration onto the new storage platform
|-
|-
| July 20th || connect racks 1 ... 16 with tested racks 17 ... 40 This step may need 15 days. We will try to run user jobs on racks 41 ... 44 ||
| end of April 2025 || Hawk final power down
|-
| July 23rd || The 512 node interim system is available for users. These 4 racks will continuously provide computing resources while additional
System tests and benchmarks take place on the 40 Rack Hawk system.
|-
| August || System functionality test (large applications, I/O performance, MPI performance / functionality, power consumption, system stability, ...).
Optimizations of the cooling system are not yet completed.
|-
| September 18th || System Acceptance. Large benchmarks ran successfully; user access on Hawk will be possible within the next few days
|-
Start of production
|-
| October 12th|| integrate racks 41 ... 44 into production system. System will be unavailable for ~ 2 weeks
|-
| October 26 || start regular user operation
|-
| May 2021 || [[Workspace_migration|Data migration]] by users from ws9 (Hazel Hen) onto ws10 (Hawk) filesystem
|-
|-
| || installation phase finished
| || installation phase finished
Line 53: Line 29:




== Terms of Use for installation phase ==
The following illustration shows the current planning of the Hunter installation. Please note that the schedule is subject to change.
We are pleased to provide early user access to Hawk.
 
Please note that the system is still far from production status in terms of stability / performance / configuration / usage.
* Both the node configuration (such as NUMA domains per socket) and the InfiniBand configuration are not yet final, and both are subject to change.
* This means that the performance of the system is not optimal. It also implies that users should not yet attempt to optimize their applications based on the current setup.
* No monitoring system is active. This may cause failed jobs if compute nodes break down and are then assigned to subsequent jobs.


The usage is granted under the following conditions:
[[File:240614-Hunter-Timeline.jpg|thumb|Installation planning as of June 2024]]
* Do not publish performance measurements of the current system configuration
* Do not monopolize the system. Give other users the opportunity to use the system. Avoid long running jobs.
* If you encounter problems, please report this to the prepared Trouble Ticket System via email to:
    rt-platform-hawk@hlrs.de
* Due to the current state of the system, your application may not yet be operational. In this case, please wait a few more days. The system will be improved very quickly.

Latest revision as of 09:17, 12 December 2024

This page will be updated as new information becomes available. Please be aware that HPC systems are made out of leading-edge components. If one of these components is delayed, the complete schedule will change. Do not take this schedule too seriously...

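More information can be found in the HLRS news articles "HPC in transition: Hunter and Herder will bring new opportunities, new challenges" (https://www.hlrs.de/news/detail/hpc-in-transition-hunter-and-herder-will-bring-new-opportunities-new-challenges) and "Exascale supercomputing is coming to Stuttgart" (https://www.hlrs.de/news/detail/exascale-supercomputing-is-coming-to-stuttgart). A fact sheet is available as the file "HLRS Hunter mit Freigabe.pdf".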


{| class="wikitable"
|-
! time frame !! action
|-
| June 18th - 19th || Hawk maintenance and preparation of infrastructure (1st part). System will be reduced to 4096 compute nodes in total
|-
| October 1st || delivery of Hunter parts (racks, APU and CPU nodes, cooling, management systems, ...)
|-
| early November || software installation and first application tests by pilot users. December 10th: first pilot users are active and are going to install additional HLRS-provided software.
|-
| November 18th || delivery of Lustre filesystems. December 9th: Lustre ws12 has been installed and will be tested for some time.
|-
| Q4 / 2024 || Hawk will be reduced by another 8 racks (if projects are active on Hunter). This step has been postponed until Hunter can be used properly.
|-
| January 31st 2025 || Acceptance of Hunter and start of data migration onto the new storage platform
|-
| end of April 2025 || Hawk final power down
|-
| || installation phase finished
|}


The following illustration shows the current planning of the Hunter installation. Please note that the schedule is subject to change.

[[File:240614-Hunter-Timeline.jpg|thumb|Installation planning as of June 2024]]