
Workspace migration


User migration to the new workspaces

In December 2016 the workspaces installed in 2011 with the Cray XE6 system Hermit will be replaced. In preparation for this, users have to migrate their data onto the replacement filesystems. Run the command ws_list -a on a frontend system to display the paths of all your workspaces; if a path matches one of the filesystems in the following table, that workspace needs to be migrated.

File system   alias    mounted on
ws1           univ_1   /lustre/cray/ws1
ws2           ind_1    /lustre/cray/ws2
ws3           univ_2   /lustre/cray/ws3
ws4           res_1    /lustre/cray/ws4
ws5           ind_2    /lustre/cray/ws5
ws6           res_2    /lustre/cray/ws6
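
For example, the following one-liner filters the ws_list output for the mount points in the table (a minimal sketch; the pattern simply matches the old paths listed above):

  # Show only workspace paths that live on the old filesystems ws1..ws6
  ws_list -a | grep -E '/lustre/cray/ws[1-6]'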

Before you start

Migrating large amounts of data consumes a lot of I/O resources. Please review your data and remove anything you no longer need before starting the transfer.

How to proceed (Version 1)

  • Users have been given access to the replacement workspaces. To find out which one, try the following command:
    • ws_allocate -F ws7 test_ws 5 # if this command runs successfully, you should prepare your jobs using this workspace and submit all new compute jobs utilizing this workspace.
    • If the above command fails, the following command should work: ws_allocate -F ws8 test_ws 5 # if not, contact your project supervisor.
  • Run all newly submitted jobs within workspaces in the new location.
  • Migrate the data from the “old” location into the freshly created workspace (please double-check that the target directory is located in either the ws7 or ws8 directory tree).

To migrate the data, we suggest the following command: rsync -a Old_ws new_ws
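
Put together, the whole check-and-migrate sequence might look like the sketch below (Old_ws and new_ws are placeholders for your actual directories; the ws7/ws8 fallback mirrors the steps above):

  # Try to allocate a test workspace on ws7; fall back to ws8 if that fails
  ws_allocate -F ws7 test_ws 5 || ws_allocate -F ws8 test_ws 5

  # Copy the data from the old workspace into the new one;
  # double-check that new_ws really lies under ws7 or ws8
  rsync -a Old_ws new_ws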

  • On November 7th, 2016 the default will be changed. Please ensure your jobs are using the new workspace directory.
  • On December 7th, 2016 the “old” workspaces ws1 … ws6 will be disconnected from the Cray system. The filesystems will remain available on the frontend systems for data migration until January 11th, 2017.
  • On January 15th, 2017 all data on the old filesystems will be deleted.


How to proceed (Version 2)

  • On October 25th, 2016 new workspaces will be allocated on the replacement filesystems. Available workspaces will also be listed on the old filesystems.
  • Workspaces on the old filesystems cannot be extended.
  • If you have to migrate data from workspaces located on a different filesystem, do not use the mv command to transfer the data; for large amounts of data it will fail. We recommend the following command instead (see also the retry sketch after this list): rsync -a ??? Old_ws new_ws
  • Take care when you create new batch jobs: if you have to migrate a workspace from an old filesystem to the new location, transferring large amounts of data takes some time. Do not run any job on that workspace while the migration is active, as this may result in inconsistent data.
  • On December 7th, 2016 the “old” workspaces ws1 … ws6 will be disconnected from the Cray system. The filesystems will remain available on the frontend systems for data migration until January 11th, 2017.
  • On January 15th, 2017 all data on the old filesystems will be deleted.
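
Since a single rsync run over a large workspace can be interrupted, one possible approach (an illustrative sketch, not an official HLRS script) is to rerun rsync until it completes; each pass only transfers files that are still missing or incomplete on the target:

  # Retry the transfer until rsync exits successfully.
  # Note: trailing slashes copy the *contents* of Old_ws into new_ws.
  until rsync -a Old_ws/ new_ws/; do
      echo "rsync interrupted, retrying in 60s..." >&2
      sleep 60
  done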


Using a parallel copy for data transfer

pcp is a Python-based parallel copy tool using MPI. It can only be run on compute nodes via aprun.

pcp is similar to cp -r; simply give it a source directory and destination, and pcp will recursively copy the source directory to the destination in parallel.

pcp has a number of useful options; use pcp -h to see a description.

This program traverses a directory tree and copies files in the tree in parallel. It does not copy individual files in parallel. It should be invoked via aprun.

Basic arguments

If run with the -l or -lf flag, pcp will be stripe aware. -l causes stripe information to be copied from the source files and directories. -lf causes all files and directories on the destination to be striped, regardless of the striping on the source.

Striping behavior can be further modified with -ls and -ld. A minimum file size can be set with -ls. Files below this size will not be striped, regardless of the source striping.

-ld will cause all directories to be unstriped.

-b C: Copy files larger than C Mbytes in C Mbyte chunks
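
As an illustration, the striping and chunking options might be combined as follows (a sketch only; the flag values mirror the job script example below and should be adapted to your data):

  # Copy stripe info from the source (-l), skip striping for files
  # below 1 MiB (-ls), and copy large files in chunks (-b)
  aprun -n 192 -N 24 ~/.local/bin/pcp -l -ls 1048576 -b 4194304 \
      /lustre/cray/ws1/Old_ws /lustre/cray/ws7/new_ws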

Algorithm

pcp runs in two phases:

Phase I is a parallel walk of the file tree, involving all MPI ranks in a peer-to-peer algorithm. The walk constructs the list of files to be copied and creates the destination directory hierarchy.

In phase II, the actual files are copied. Phase II uses a master-slave algorithm: rank 0 is the master and dispatches file copy instructions to the slaves (ranks 1...n).

Job Script example

There is a job script example located at /opt/hlrs/wrappers/bin/pcp.qsub:

  #!/bin/bash
  #PBS -N IO_copy_test
  #PBS -l nodes=8
  #PBS -l walltime=0:30:00
  #PBS -j oe

  cd $PBS_O_WORKDIR

  module load tools/python/2.7.8

  SOURCEDIR=<YOUR SOURCE DIRECTORY HERE>
  TARGETDIR=<YOUR TARGET DIRECTORY HERE>

  /usr/bin/time -p aprun -n 192 -N24 -d1 ~/.local/bin/pcp -ls 1048576 -b 4194304 $SOURCEDIR $TARGETDIR
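
To use it, copy the script, fill in the source and target directories, and submit it with qsub (node counts and walltime may need adjusting for your transfer):

  cp /opt/hlrs/wrappers/bin/pcp.qsub .
  # edit SOURCEDIR and TARGETDIR in pcp.qsub, then:
  qsub pcp.qsub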

Operation of the workspaces will be changed:

  • To create a workspace or extend an existing one, an interactive shell is necessary.
  • Because performance drops when quota usage is high, no job of any group member will be scheduled for computation as long as the group quota usage exceeds 80%. All blocked group members are notified by e-mail (if a valid address is registered); see the sketch after this list for checking your usage.
  • accounting
  • Open question: will a longer workspace lifetime be allowed? Probably it should be, if only because of four weeks of vacation; 60 days has been suggested.
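
Group members can check how close their group is to the 80% limit with Lustre's quota tool (a hedged sketch, assuming group quotas are enabled on the new filesystems; replace mygroup and the mount point as appropriate):

  # Show group quota usage on the new ws7 filesystem (path assumed)
  lfs quota -g mygroup /lustre/cray/ws7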

Please read https://kb.hlrs.de/platforms/index.php/Storage_usage_policy