- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

NEC Cluster Disk Storage (vulcan)

From HLRS Platforms
=== HOME Directories ===
Users' HOME directories are located on a shared RAID system and are mounted via NFS on all login (frontend) and compute nodes.
The path to the HOME directories is consistent across all nodes. The filesystem space on HOME is limited by a quota (50GB per user, 200GB per group) and is shared with the resources hawk, vulcan, vulcan2! The quota usage for your account and your groups can be listed with the command <tt>na_quota</tt>, available on the login nodes.


{{Warning|text=Due to the limited network performance, the HOME filesystem is not intended for large files and fast I/O! Do not read or write files from many nodes (>200) as this will cause trouble for all users. Use single read process + Bcast approach or [[MPI-IO]] instead.}}
In addition, a '''project''' volume can be made available on request (see [[Project_filesystem]])
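A quick way to check how close you are to the HOME quota from a login node is sketched below; <tt>na_quota</tt> is the vulcan-specific tool named above, while the other commands are standard fallbacks that work anywhere:

```shell
# Check HOME usage from a login node. na_quota is the vulcan-specific
# quota tool; df and du are standard fallbacks.
na_quota 2>/dev/null || true         # per-user / per-group quota usage (vulcan only)
df -h "$HOME"                        # usage of the filesystem HOME is mounted on
du -sh "$HOME" 2>/dev/null || true   # total size of your own HOME tree
```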


=== SCRATCH directories ===
For large files and fast I/O, please use
<ul>
* '''lustre''': a fast distributed cluster filesystem using the InfiniBand network infrastructure. This filesystem is available on all compute nodes and on the frontend/login nodes.
* The capacity on vulcan is ~2.2 PByte. The system consists of 2 MDS servers, 2 OSS servers and 16 OST storage targets.
* The capacity on vulcan2 is ~500 TByte, the bandwidth is about 6 GB/sec. The system consists of 2 MDS servers, 2 OSS servers and 16 OST storage targets, each of them one RAID6 LUN, 8+PQ, with 4 TB disks.
</ul>
Scratch directories are available on all compute and login (frontend) nodes via the [[workspace mechanism]].
{{Note|text=To get the best performance using [[MPI-IO]], it may be necessary to tune the file distribution across the storage targets.}}
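On Lustre, tuning the file distribution usually means setting the striping of the output directory. The sketch below assumes the standard <tt>lfs</tt> Lustre client tool and uses an example directory name; it is guarded so it only acts on a node where the Lustre client is installed:

```shell
# Stripe a directory over all OSTs so large files written via MPI-IO are
# spread across the storage targets. "lfs" is the standard Lustre client
# tool; the directory name is just an example.
OUTDIR=${OUTDIR:-./striped_output}
mkdir -p "$OUTDIR"
if command -v lfs >/dev/null; then
    lfs setstripe -c -1 "$OUTDIR"   # -c -1: stripe over all available OSTs
    lfs getstripe "$OUTDIR"         # show the resulting layout
else
    echo "lfs not found - run this on a node with the Lustre client"
fi
```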
{{Warning|text=Workspaces have a restriction: there is a '''maximum time limit''' for each workspace (60 days), after which it will be '''deleted automatically'''.}}
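The workspace mechanism is driven by the <tt>ws_*</tt> command-line tools on the cluster. A sketch of the typical lifecycle follows; the workspace name and duration are examples, and the snippet is guarded so it only acts where the tools exist:

```shell
# Typical workspace lifecycle. The ws_* tools belong to the HLRS workspace
# mechanism and exist only on the cluster; name and duration are examples.
if command -v ws_allocate >/dev/null; then
    WSDIR=$(ws_allocate myrun 30)   # allocate workspace "myrun" for 30 days
    echo "workspace at: $WSDIR"
    ws_list                         # list workspaces and remaining lifetimes
    ws_release myrun                # release it when done, before the 60-day limit
else
    echo "workspace tools not installed on this host"
fi
```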
 
=== localscratch ===
Some special node types (see the motd at login or the documentation on [[Batch_System_PBSPro_(vulcan)#Node_types | node_types]]) have a local disk installed and mounted on /localscratch. On these nodes:
* each batch job creates a /localscratch/$PBS_JOBID directory owned by the job owner.
* each ssh login session creates a /localscratch/$UID directory owned by $UID.
* at the end of a user batch job, the directory $PBS_JOBID in /localscratch on the node will be removed!
* at the end of all login sessions of a user on a node, the $UID directory in /localscratch will be removed!
* the individual /localscratch filesystem on each node is not shared with other nodes.


=== Filesystem Policy ===
IMPORTANT! NO BACKUP!! There is NO backup done of any user data located on HWW Cluster systems. The only protection of your data is the redundant disk subsystem. This RAID system is able to handle a failure of one component. There is NO way to recover inadvertently removed data. Users have to backup critical data on their local site!

Latest revision as of 13:02, 15 December 2023
