CRAY XE6 Disk Storage: Difference between revisions

Revision as of 12:17, 5 May 2012

HOME Directories

All user HOME directories are located on a shared RAID system. The compute nodes and the login node (frontend) mount the HOME directories via NFS, so the path to your HOME is the same on every node of the cluster. The filesystem space on HOME is limited by a quota. Because network performance is limited, the HOME filesystem is not intended for fast I/O or for large files. Reading or writing even small files from many nodes (> 200) will cause trouble for all users. Applications should designate a single process to do the read and then use a broadcast mechanism (e.g. MPI_Bcast) to distribute the data to all nodes, or use a parallel I/O mechanism such as MPI-IO.
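The single-reader pattern described above can be sketched as follows. This is a minimal illustration, not a site-provided tool: the function name `read_and_broadcast` and the class `SingleProcessComm` are hypothetical, and the communicator is only assumed to offer `Get_rank()` and `bcast(obj, root=...)` in the way mpi4py's `MPI.COMM_WORLD` does.

```python
def read_and_broadcast(comm, path, root=0):
    """Read `path` on one rank only and broadcast its contents to all ranks.

    `comm` is assumed to provide Get_rank() and bcast(obj, root=...),
    as an mpi4py communicator does. Only the root rank opens the file.
    """
    data = None
    if comm.Get_rank() == root:
        # Only this one process touches the file on HOME.
        with open(path, "rb") as handle:
            data = handle.read()
    # Every rank takes part in the broadcast and receives the root's bytes.
    return comm.bcast(data, root=root)


class SingleProcessComm:
    """Hypothetical stand-in communicator with exactly one rank.

    It exists only so the sketch can be tried without an MPI installation.
    """

    def Get_rank(self):
        return 0

    def bcast(self, obj, root=0):
        # With a single rank, the broadcast result is the root's own object.
        return obj
```

In a real job, assuming mpi4py is available on the system, the communicator would be `MPI.COMM_WORLD` (from `from mpi4py import MPI`) and the script would be started with the system's parallel job launcher. The file is then opened once, regardless of how many nodes take part.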

SCRATCH Directories

For large files and fast I/O, please use Lustre. It is a fast distributed cluster filesystem that uses the high-speed network infrastructure (Gemini). This filesystem is available on all compute nodes and on the frontend/login nodes.

You are responsible for obtaining scratch space from the system yourself. To get access to these global scratch filesystems, you have to use the [workspace mechanism]. Please note that each workspace has a maximum time limit (30 days). Once a workspace has exceeded its time limit, the workspace directory is deleted.
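To keep track of the 30-day limit, the expiry date of a workspace can be computed from its creation date. The sketch below is only date arithmetic: the helper names are hypothetical, and it assumes the limit is counted in whole calendar days from the day of creation, which the text above does not specify.

```python
from datetime import date, timedelta

MAX_WORKSPACE_DAYS = 30  # maximum workspace time limit stated in this section


def workspace_expiry(created, lifetime_days=MAX_WORKSPACE_DAYS):
    """Return the date on which a workspace created on `created` reaches its limit."""
    if not 0 < lifetime_days <= MAX_WORKSPACE_DAYS:
        raise ValueError("lifetime must be between 1 and 30 days")
    return created + timedelta(days=lifetime_days)


def days_remaining(created, today, lifetime_days=MAX_WORKSPACE_DAYS):
    """Return the days left before the limit (negative once the limit has passed)."""
    return (workspace_expiry(created, lifetime_days) - today).days
```

For example, `days_remaining(date(2012, 5, 5), date.today())` gives the number of days left to copy results elsewhere for a workspace created on 5 May 2012 under these assumptions.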

The available storage capacity of about 2.5 PB has been divided into five areas:

Name	Usage	Restricted
Univ_1	general use	no
Univ_2	only a few files per user (~1000)	no
Res_1	reserved for demanding projects	yes
Res_2	reserved for demanding projects	yes
Ind_2	shared access with the Viz cluster at HLRS	yes

The Univ_1 and Univ_2 filesystems are split because of the time needed to recover from a failure. If a failure requires a filesystem check, the time needed for the check depends on the number of files stored on the filesystem. To minimize downtime after such a failure, users with few (and probably large) files are kept on a separate filesystem, so that production on the Cray system can resume soon, while the group using the filesystem with about 100 million files will have to wait a couple of days (or weeks) until it is online again. Users and groups may have access to either Univ_1 or Univ_2, but not both.
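Because the split is based on the number of files, it can help to know roughly how many files a directory tree holds. The following is a minimal sketch with a hypothetical helper name; it is not a site tool. It should be run from a single process only, since walking a large tree is itself a metadata-heavy operation.

```python
import os

FEW_FILES_HINT = 1000  # approximate per-user figure given for Univ_2 in the table


def count_files(top):
    """Count regular files below `top` without following symbolic links."""
    total = 0
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Descend into real subdirectories later.
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    total += 1
    return total
```

Comparing `count_files(path)` with `FEW_FILES_HINT` gives a rough indication of whether a data set matches the "few files" usage pattern; packing many small files into one archive reduces the count.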

Filesystem Policy

IMPORTANT! NO BACKUP!! There is NO backup of any user data located on HWW Cluster systems. The only protection of your data is the redundant disk subsystem. This RAID system can handle the failure of one component. There is NO way to recover inadvertently removed data. Users have to back up critical data at their local site!

For data that must remain available longer than the workspace time limit allows, and for very important data, please use the High Performance Storage System (HPSS).

