- Information in the HLRS Wiki is not legally binding and is provided without warranty -

Phoenix Introduction

Hardware and Architecture

The HWW Opteron Cluster phoenix platform consists of one frontend node for interactive access (phoenix.hww.de), several compute nodes for the execution of parallel programs, and some service nodes.

Different compute node types are installed:

  1. AMD Opteron 246 dual socket single core, 2GHz, 4GB memory, Myrinet, (queue/property: myrinet/old)
  2. AMD Opteron 2220 dual socket dual core, 2.8GHz, 8GB memory, GigE, ASUS KFN4-DRE mainboard, (queue/property: normal/kfn4dre)
  3. AMD Opteron dual socket dual core, 2.8GHz, 8GB memory, GigE, (queue/property: normal/rs161e4)
  4. AMD Opteron 2222 dual socket dual core, 3.0GHz, 8GB/16GB/32GB/64GB memory, GigE, (queue/property: normal/rs161e5)
  5. AMD Opteron dual socket dual core, 2.6GHz, 8GB memory, Infinipath, (queue/property: ipath/dl145g3)
  6. AMD Opteron dual socket dual core, 3.0GHz, 8GB memory, Infiniband, (queue/property: iband/rs161e5)


Features

  • Operating System: ScientificLinux 5 on AMD Opteron based nodes, diskless
  • Batch system: Torque/Maui
  • Node-to-node interconnect: Infinipath, Infiniband, GigE, Myrinet
  • Local Disk for scratch on some nodes
  • Disk > 17 TB
  • Lustre 2.7 TB


Short overview of installed compute nodes
Type  Freq     Cores  Memory              Disk  Interconnect      PBS Queue  PBS properties  Nodes                                        Number
1     2.0 GHz  2*1=2  4GB                 30GB  GigE/Myrinet      myri       old             c0-01 - c3-31                                118
2     2.8 GHz  2*2=4  8GB                 -     GigE              normal     kfn4dre         c4-01 - c4-08                                8
3     2.8 GHz  2*2=4  8GB                 -     GigE              normal     rs161e4         c4-09 - c4-16                                8
4     3.0 GHz  2*2=4  8GB/16GB/32GB/64GB  -     GigE              normal     rs161e5         c4-17 - c4-22, c6-01 - c6-34, c5-11 - c5-36  82
5     2.6 GHz  2*2=4  8GB                 70GB  GigE, Infinipath  ipath      dl145g3         c5-01 - c5-10                                10
6     3.0 GHz  2*2=4  8GB                 -     GigE, Infiniband  iband      rs161e5         c6-27 - c6-32                                6

Access

The only way to access gerris.hww.de (the frontend node of the HWW Opteron Cluster phoenix) from outside is through Secure Shell (ssh).
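
A minimal login sketch follows. The host name is the one given above; HWW_USER is a hypothetical placeholder for your own account name and is not defined by the system. If it is unset, the script only prints the command to use.

```shell
# Frontend host name as given above; HWW_USER is a hypothetical placeholder.
FRONTEND="gerris.hww.de"
HWW_USER="${HWW_USER:-}"

if [ -n "$HWW_USER" ]; then
    # Opens an interactive login shell on the frontend node.
    ssh "${HWW_USER}@${FRONTEND}"
else
    echo "Set HWW_USER to your login, then run: ssh <login>@${FRONTEND}"
fi
```

File transfer tools that run over ssh, such as scp, should work against the same host name.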

Usage

The frontend node

gerris.hww.de

is intended as the single point of access to the entire cluster. Here you can set up your environment, move your data, edit and compile your programs, and create batch scripts. Interactive use of the frontend node that causes a high CPU or memory load (e.g. production runs) is NOT allowed.

The compute nodes for running parallel jobs are available only through the Batch system on the frontend node.
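
As a rough illustration, a Torque submission that selects a node type by queue and property could be composed as follows. This is a sketch under assumptions: the nodes=N:property:ppn=M form is standard Torque syntax, but whether phoenix expects exactly this request format is not stated here; job.sh and the walltime value are hypothetical.

```shell
# Compose the node part of a Torque resource request.
#   $1 = number of nodes, $2 = node property, $3 = processes per node
node_request() {
    printf 'nodes=%s:%s:ppn=%s' "$1" "$2" "$3"
}

QUEUE="normal"                           # PBS queue of node type 2
REQUEST="$(node_request 2 kfn4dre 4)"    # 2 nodes with property kfn4dre, 4 cores each
JOB_SCRIPT="job.sh"                      # hypothetical batch script name

# Printed rather than executed, so the sketch is safe to run anywhere;
# on the frontend node, run the printed command to submit the job.
SUBMIT_CMD="qsub -q ${QUEUE} -l ${REQUEST},walltime=00:10:00 ${JOB_SCRIPT}"
echo "$SUBMIT_CMD"
```

Building the request in a small function keeps the queue, property and core count in one place when switching between node types.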



HOME Directories


All user HOME directories are located on the I/O servers. The compute nodes and the login (frontend) node mount the HOME directories via NFS, so the path to your HOME is the same on every node of the cluster. The filesystem space on HOME is limited by a quota of 50MB! Please note the Filesystem Policy!

SCRATCH directories


Local Scratch

When you allocate nodes with local disks (see table) through the batch queuing system (Torque), /tmp on the compute nodes can be used as scratch space. After your batch job has finished, /tmp is cleaned automatically.
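
Inside a batch script, local scratch use might look like the sketch below. The file names and the tr command are stand-ins for real input data and a real program; SCRATCH_ROOT is a hypothetical variable that merely allows the sketch to be tried outside the cluster.

```shell
# Job-private directory below /tmp (SCRATCH_ROOT is a hypothetical override).
SCRATCH_ROOT="${SCRATCH_ROOT:-/tmp}"
JOB_SCRATCH="$(mktemp -d "${SCRATCH_ROOT}/job.XXXXXX")"

# Stage input in (stand-in for copying real input files to scratch).
echo "input data" > "${JOB_SCRATCH}/input.dat"

# Work on the local disk (tr is a stand-in for the real program).
tr 'a-z' 'A-Z' < "${JOB_SCRATCH}/input.dat" > "${JOB_SCRATCH}/output.dat"

# Save results before the job ends (stand-in for copying them to permanent storage).
RESULT="$(cat "${JOB_SCRATCH}/output.dat")"

# Clean up explicitly rather than relying on the automatic cleanup alone.
rm -rf "$JOB_SCRATCH"
```

Because /tmp is cleaned after the job, anything worth keeping has to be copied away within the job itself.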

Global Scratch

Global scratch space is also available on shared filesystems. There are 3 global shared filesystems available on phoenix:

  • default
    It's a filesystem which is available via NFS on all phoenix compute nodes and on the phoenix frontend system.
    Capacity: 10TB
  • scratch
    It's a filesystem which is available via NFS on all phoenix compute nodes and on the phoenix frontend system.
    Capacity: 1TB
  • lustre
    It's a parallel distributed filesystem using a Gigabit Ethernet interface.
    Capacity: 2.7TB


This space is not assigned automatically; you have to request it from the system yourself. To get access to these global scratch filesystems, you have to use the workspace mechanism.
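
The commands of the workspace mechanism are not named on this page. The sketch below assumes the ws_allocate tool from the HLRS workspace tools, called with a workspace name and a lifetime in days and printing the workspace path; the name myrun and the lifetime are hypothetical. How one of the three filesystems is selected is not shown, since that is not documented here.

```shell
WS_NAME="myrun"     # hypothetical workspace name
WS_DAYS=10          # hypothetical lifetime in days

if command -v ws_allocate >/dev/null 2>&1; then
    # Assumption: ws_allocate prints the path of the new workspace on stdout.
    WS_DIR="$(ws_allocate "$WS_NAME" "$WS_DAYS")"
    echo "workspace: $WS_DIR"
else
    WS_DIR=""
    echo "ws_allocate not found; run this on the cluster frontend" >&2
fi
```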

Environment Settings


In order to use some software features, such as special MPI versions or compilers, you have to adjust your environment settings. To modify the default HLRS PHOENIX environment settings on login, you can create a file $HOME/.profile which contains your own environment settings. The login shell is a bash shell, which reads $HOME/.profile during login!
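
Lines of the following kind could go into $HOME/.profile. The directories $HOME/bin and $HOME/lib are illustrative assumptions, not paths provided by the system.

```shell
# Example content for $HOME/.profile (all paths are illustrative).

# Put privately installed programs first in the search path.
export PATH="$HOME/bin:$PATH"

# Add a private library directory; keep any existing value without
# leaving a trailing colon when the variable was empty.
export LD_LIBRARY_PATH="$HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```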


Environment Settings using command module

Environment settings made with this method are not saved and are lost when the session ends. A new session (login, new job) starts with the default HWW environment.

The cluster system uses modules in the user environment to support multiple versions of software, such as compilers, and to create integrated software packages. As new versions of the supported software become available, they are added automatically to the programming environment, while earlier versions are retained to support legacy applications. By specifying the module to load, you can choose the default version of an application, or another version. Modules also provide a simple mechanism for updating certain environment variables, such as PATH, MANPATH, and LD_LIBRARY_PATH. The following topics describe the fundamentals of using the modules environment.


  • to invoke the module command, type:
    module option args
    module help modulecommand
    The help command provides more detailed information on the specified module. Without the modulecommand argument, you get online help for the module command itself.
    module avail
    The avail option displays all the modules that are available on the system. Where there is more than one version of a module, the default version is denoted by (default).
    module list
    The list option displays all the modules that are currently loaded into your user environment.
    module add / module load modulename
    The add option and the load option have the same function - to load the specified module into your user environment.
    module rm / module unload modulename
    The rm option and the unload option have the same function - to unload the specified module from your user environment. Before loading a module that replaces another version of the same package, you should always unload the module that is to be replaced.
    module display modulename
    The display option shows the changes that the specified module will make in your environment, for example, what will be added to the PATH and MANPATH environment variables.
    module switch modulename/currentversion modulename/newversion
    The switch option replaces the currently loaded version of a module with a different version. When the new version is loaded, the man page for the specified software will also be updated.


  • using $HOME/.modulerc
    This file can be used to load or to define your own environment during each login. An example looks like this:
    #%Module1.0#
    
    set version 1.0
    module load use.own
    

    The module use.own will add $HOME/privatemodules to the list of directories that the module command will search for modules. Place your own module files here. This module, when loaded, will create this directory if necessary.

    see also:

     man module
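
To illustrate the private module directory, the sketch below writes a small modulefile and, where the module command is available, loads it through use.own. The tool name mytool, its version, and its installation path under $HOME/opt are hypothetical; PRIVATE_MODULES is a hypothetical variable that defaults to the directory searched by use.own.

```shell
# Directory that use.own adds to the module search path.
PRIVATE_MODULES="${PRIVATE_MODULES:-$HOME/privatemodules}"
mkdir -p "${PRIVATE_MODULES}/mytool"

# Modulefile for a hypothetical tool installed under $HOME/opt/mytool/1.0.
cat > "${PRIVATE_MODULES}/mytool/1.0" <<'EOF'
#%Module1.0
module-whatis "mytool 1.0 (private installation)"
set root $env(HOME)/opt/mytool/1.0
prepend-path PATH    $root/bin
prepend-path MANPATH $root/share/man
EOF

# Load it only where the module command exists (e.g. on the cluster).
if type module >/dev/null 2>&1; then
    module load use.own
    module load mytool/1.0
    module list
fi
```

The file name below the tool directory serves as the version, so further versions can be added as additional files next to 1.0.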
    


Filesystem Policy

IMPORTANT! NO BACKUP!! No backup is made of any user data located on HWW systems. The only protection of your data is the redundant disk subsystem. This RAID system (RAID 5) can tolerate the failure of one component (e.g. a single disk or a controller). There is NO way to recover inadvertently removed data. Users have to back up critical data at their local site!
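
One possible way to pull critical data to a local machine is rsync over ssh, sketched below. This assumes rsync is installed on both sides, which is not confirmed here; the function name and the example arguments are hypothetical. The function is only defined, not called.

```shell
# Run on your LOCAL machine, not on the cluster.
#   $1 = HWW login, $2 = remote directory, $3 = local target directory
pull_results() {
    mkdir -p "$3" && rsync -az -e ssh "$1@gerris.hww.de:$2/" "$3/"
}

# Example call (hypothetical login and directory names):
#   pull_results your_login results ./hww-backup
```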

Support / Feedback

Please report all problems to:


  • System Administrators
    1. Thomas Beisel
    2. Bernd Krischok
    3. H. Ohno



  • Applications
    1. Martin Bernreuther