- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Workspace migration


Note: This page describes the necessary steps to migrate workspaces to another workspace filesystem.

HPE Hawk will be shut down for good at the end of April. With it, the workspace file system ws11 mounted on Hawk, and all data located on it, will no longer be available.

For projects that continue from Hawk on the next-generation supercomputer Hunter (HPE), it is therefore very important to also transfer the existing and still required workspace data to Hunter.


Data migration to new workspaces

  • On Hawk the workspace filesystem is ws11. On Hunter the workspace filesystem is ws12. The ws12 workspace filesystem from Hunter is also mounted on Hawk under the identical path name. (ws11 is not available on Hunter!)
  • The first step is to create a workspace on the ws12 filesystem. Creating workspaces on ws12 is only possible on the Hunter supercomputer. Log in to Hunter, create a workspace and remember the path names of your target workspaces; use the command ws_list -a (see the workspace mechanism document and the command sketch after this list).
  • The next steps take place on Hawk. Execute the command ws_list -a on a Hawk frontend system to display the path names of your workspaces on ws11.
    • Now you have the source paths of your workspace directories on ws11 and the target paths located on the Hunter workspace filesystem ws12, and you are ready for the actual migration of the data. Please use the mpifileutils dcp described in the following chapters.
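
A minimal command sketch of these first steps (the workspace name my_migration_ws and the duration of 30 days are placeholders; adjust them to your project):

# On a Hunter login node: create the target workspace on ws12 and note its path
ws_allocate my_migration_ws 30
ws_list -a

# On a Hawk frontend system: list the source workspace paths on ws11
ws_list -a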

Before you start

Migrating large amounts of data consumes a lot of I/O resources. Please review your data and remove anything that is no longer needed, or move it into HPSS.

Important remarks

  • If you have to migrate data residing in workspaces from one filesystem to another, do not use the mv command to transfer the data. For large amounts of data, this will fail due to time limits. For large amounts of data or, e.g., millions of small files, we currently recommend running the following command inside a single-node batch job: rsync -a --hard-links Old_ws/ new_ws/ (a sketch of such a job follows this list).
  • Please try to use the mpifileutils dcp described below.
  • Take care when you create new batch jobs. If you have to migrate your workspace from an old filesystem to the new location, this takes time. Do not run any job on that data while the migration process is active; this may result in inconsistent data.
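
A minimal sketch of such a single-node rsync batch job (assuming a PBS setup analogous to the dcp job below; the node type, the walltime and the two workspace paths are placeholders):

#!/bin/bash
#PBS -N ws-rsync-copy
#PBS -l select=1:node_type=rome
#PBS -l walltime=04:00:00

SOURCEDIR=<YOUR ws11 WORKSPACE HERE>
TARGETDIR=<YOUR ws12 WORKSPACE HERE>

# -a preserves permissions, ownership and timestamps; --hard-links preserves hard links.
# The trailing slashes copy the contents of SOURCEDIR into TARGETDIR.
rsync -a --hard-links ${SOURCEDIR}/ ${TARGETDIR}/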

Operation / Policies of the workspaces:

  • No job of any user-group member will be scheduled for computation as long as the group quota is exceeded.
  • Accounting.
  • Max. lifetime of a workspace is currently 60 days.
  • Default lifetime of a workspace is 1 day.
  • Max. number of workspace extensions is 3.
  • Please read related man pages or the online workspace mechanism document
  • To list your available workspace file systems, use ws_list -l (on Hawk for ws11, on Hunter for ws12).
  • Users can restore expired workspaces using ws_restore (see the command sketch below).
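
A short sketch of the related commands (ws_restore -l as a way to list restorable workspaces is an assumption; check the man pages or the workspace mechanism document for the exact syntax):

ws_list -l      # list the workspace file systems available to you
ws_list -a      # list your workspaces with their full paths and remaining lifetimes
ws_restore -l   # list expired workspaces that can still be restored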

Please read https://kb.hlrs.de/platforms/index.php/Storage_usage_policy

Using mpifileutils for data transfer

The mpifileutils suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for large datasets, providing speedups of up to 50x compared to single-process jobs. The tools can only be run on compute nodes via mpirun.

dcp is similar to cp -r; simply provide a source directory and destination and dcp will recursively copy the source directory to the destination in parallel.

dcp has a number of useful options; use dcp -h to see a description or consult the User Guide (https://mpifileutils.readthedocs.io/en/v0.11.1/).

It should be invoked via mpirun.

We highly recommend using dcp only with an empty ~/.profile and ~/.bashrc! Furthermore, take care that only the following modules are loaded when using mpifileutils (this can be achieved by logging in to the system without modifying the default list of modules and loading only the module mpifileutils):

1) system/site_names
2) system/ws/20241114
3) system/wrappers/1.0
4) hlrs-software-stack/current
5) gcc/12.3.0
6) system/ucx/1.16.0-gcc-8.5.0
7) mpt/2.30
8) mpifileutils/0.11.1
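
One way to arrive at and verify this module environment (a sketch; module names and versions follow the list above and may change over time):

# log in freshly so that only the default modules are loaded, then:
module load mpifileutils
module list   # should show only the eight modules listed above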

dcp

Parallel MPI application to recursively copy files and directories.

dcp is a file copy tool in the spirit of cp(1) that evenly distributes the work of scanning the directory tree, and copying file data across a large cluster without any centralized state. It is designed for copying files that are located on a distributed parallel file system, and it splits large file copies across multiple processes.

Run dcp with the -p option to preserve permissions, timestamps, and ownership.

-p : preserve permissions, timestamps, and ownership

--chunksize C : copy files larger than C bytes in C-byte chunks (default is 4 MB)

We highly recommend using the -p option.
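
For illustration, a single dcp invocation combining these options inside a batch job (the rank count and the two directory paths are placeholders; a complete job script follows in the next section):

mpirun -np 48 dcp -p --chunksize 8MB ${SOURCEDIR}/ ${TARGETDIR}/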

Job Script example

Here is an example of a job script.

You have to change SOURCEDIR and TARGETDIR according to your setup. The number of nodes and the wallclock time should also be adjusted.


#!/bin/bash
#PBS -N parallel-copy
#PBS -l select=2:node_type=rome:mpiprocs=24
#PBS -l walltime=00:20:00

module load mpifileutils

SOURCEDIR=<YOUR SOURCE DIRECTORY HERE>
TARGETDIR=<YOUR TARGET DIRECTORY HERE>

sleep 5
cores=$(cat $PBS_NODEFILE | wc -l)   # total number of MPI ranks (nodes x mpiprocs)

time_start=$(date "+%c  :: %s")
mpirun -np $cores dcp -p --bufsize 8MB ${SOURCEDIR}/ ${TARGETDIR}/
time_end=$(date "+%c  :: %s")

# the epoch seconds (%s) are the last field of the date output
tt_start=$(echo "$time_start" | awk '{print $NF}')
tt_end=$(echo "$time_end" | awk '{print $NF}')
(( total_time=tt_end-tt_start ))
echo "Total runtime in seconds: $total_time"