- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Revision as of 14:21, 14 June 2019

Urika GX

Official Documentation:

  • Hardware overview: https://pubs.cray.com/content/S-3017/2.2.UP00/urika-gx-system-overview

Hardware

The system consists of two subsystems: Enkidu and Gilgamesch. Nodes are interconnected by the Aries network and by two Ethernet networks: "Operational" (not configured by default) and "Management".

Compute (and Service) nodes have extra storage:

  • 2 TB HDD /mnt/hdd-2/
  • 1.6 TB SSD /mnt/ssd/ and /mnt/ssd2/

Both are used for Spark cache and HDFS.

Enkidu

  • 16 Nodes in total
  • 2 Login
  • 2 IO
  • 3 Service
  • 9 Compute

Gilgamesch

  • 48 Nodes in total
  • 2 Login
  • 2 IO
  • 3 Service
  • 41 Compute

Workflow/Software

SSH

SSH access is available from the "Mitarbeiter" (staff) network and the university VPN network:

  • ssh enkidu-login1.hlrs.de
  • ssh gilgamesch-login1.hlrs.de

You can download a VPN Client here.

Recommended ssh config (~/.ssh/config):

Host gilgamesch
    Hostname gilgamesch-login1.hlrs.de
    User hpcXXXXX
    DynamicForward 8080
Host enkidu
    Hostname enkidu-login1.hlrs.de
    User hpcXXXXX
    DynamicForward 8080
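
If you prefer not to edit ~/.ssh/config, the same tunnel can be opened from the command line, since DynamicForward corresponds to the -D option of ssh. The following is a minimal sketch: the function name urika_tunnel is hypothetical, and the -N option (do not run a remote command) is an addition that makes the connection tunnel-only.

```shell
# Hypothetical helper: open a SOCKS tunnel to a Urika login node without
# an entry in ~/.ssh/config.
# Usage: urika_tunnel <user> [host] [port]
urika_tunnel() {
    user=$1
    host=${2:-gilgamesch-login1.hlrs.de}
    port=${3:-8080}
    # -D <port>: local dynamic (SOCKS) forward, same as "DynamicForward <port>"
    # -N: do not start a remote shell, keep the connection open for the tunnel only
    ssh -N -D "$port" "$user@$host"
}

# Example: urika_tunnel hpcXXXXX enkidu-login1.hlrs.de
```

Drop -N if you also want an interactive shell on the login node in the same connection.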

Change your password after first login:

module load tools/ldap
passwd

Create ssh keys on your local PC if you do not have any yet: ssh-keygen

Copy your public key to Urika (change your default password first!):

ssh-copy-id gilgamesch
ssh-copy-id enkidu

Useful Modules

  • module load tools/ldap - makes custom passwd command available for changing own password
  • module load tools/mesos - provides minfo and mreserve commands to work with Mesos.
  • module load tools/proxy - configures proxy settings for Internet access (http, https, ftp). Don't forget to remove this module when you no longer need Internet access, as it is a potential security issue.
  • module load tools/forceproxy - forces all "external" traffic through the proxy. This is helpful for applications without proxy support. Don't forget to remove this module when you no longer need Internet access, as it is a potential security issue.
  • module load python/3.6 - custom Python with Jupyter and HLRS Kernel.
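
Because the proxy modules should not stay loaded longer than necessary, one option is to wrap a single command so that the module is removed again afterwards. This is a minimal sketch, not an HLRS-provided tool: the function name with_proxy is hypothetical, and it assumes the standard Environment Modules subcommand module unload is available alongside module load.

```shell
# Hypothetical helper (not part of the HLRS tools): run one command with
# tools/proxy loaded, then unload the module again whatever the outcome.
with_proxy() {
    module load tools/proxy || return 1
    "$@"
    rc=$?
    # Assumes the standard "module unload" subcommand of Environment Modules.
    module unload tools/proxy
    return "$rc"
}

# Example: with_proxy wget https://example.org/file.tar.gz
```

The wrapped command's exit status is passed through, so the helper can be used in scripts that check for download failures.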

Mesos tools

Check resources usage:

module load tools/mesos
minfo

This shows a list of all compute nodes and the jobs running on them. You will get an error message if the system is running in Secure mode or is under maintenance.

For debugging purposes there is the mreserve script, which lets you reserve specific compute nodes exclusively.

The following example reserves nid00001 and nid00002 (as soon as any of them is free) for 15 minutes under the job name Maintenance. You may need to run it inside tmux or screen, because mreserve holds the nodes until the timeout expires, the reservation task dies, or the user presses Ctrl+C.

module load tools/mesos
mreserve Maintenance 15 nid00001 nid00002
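
Since the reservation ends when the mreserve process dies, one way to keep it alive after you close your terminal is to start it inside a detached tmux session. The sketch below is an assumption about a convenient workflow, not a documented HLRS procedure; the function name reserve_in_tmux and the session name are hypothetical. It types the commands into an interactive shell via tmux send-keys, on the assumption that the module command is available there.

```shell
# Hypothetical helper: start mreserve in a detached tmux session.
# Usage: reserve_in_tmux <session> <job-name> <minutes> <node>...
reserve_in_tmux() {
    session=$1
    job=$2
    minutes=$3
    shift 3
    # Create a detached session running an interactive shell.
    tmux new-session -d -s "$session" || return 1
    # Type the module and mreserve commands into that shell.
    tmux send-keys -t "$session" "module load tools/mesos" Enter
    tmux send-keys -t "$session" "mreserve $job $minutes $*" Enter
}

# Example: reserve_in_tmux debug Maintenance 15 nid00001 nid00002
# Watch it with:   tmux attach -t debug
```

Pressing Ctrl+C inside the attached session should release the nodes, as described above.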

Using Web GUIs

The ssh config above contains DynamicForward options (the -D parameter if you do not use the config). This gives you access to Urika-internal services through a SOCKS proxy.

For this, using Firefox with a separate, newly created profile is recommended.

  • Open about:profiles, create and launch a new profile.
  • In the new profile open about:preferences and search for proxy
  • Enter following settings (as shown below):
    • Manual proxy configuration
    • SOCKS Host: 127.0.0.1
    • SOCKS Port: 8080
    • SOCKS v5
    • No proxy for - leave the field empty
    • Proxy DNS

Starting from Firefox v67, you have to do this as well:

  • Open about:config
  • Find network.proxy.allow_hijacking_localhost and set it to true (Do NOT do this in your default Firefox profile! This may cause security issues when using proxies!)

[Figure: Firefox proxy settings]
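
Before configuring Firefox it can help to confirm that the SOCKS tunnel itself works. The following sketch uses curl through the forwarded port; the function name check_urika_proxy is hypothetical, and the default URL is the Jupyter Hub address given on this page. The --socks5-hostname option lets the proxy side resolve the host name, which corresponds to the Proxy DNS setting.

```shell
# Hypothetical helper: request only the HTTP headers of a Urika-internal URL
# through the SOCKS tunnel opened by "DynamicForward 8080".
# Usage: check_urika_proxy [url] [port]
check_urika_proxy() {
    url=${1:-http://login1:7800/}
    port=${2:-8080}
    # --socks5-hostname: use SOCKS5 and resolve the host name on the proxy side
    curl --silent --show-error --head --max-time 10 \
         --socks5-hostname "127.0.0.1:$port" "$url"
}

# Example: check_urika_proxy
```

If the ssh connection is not open, curl should report a connection error for 127.0.0.1 instead of printing response headers.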

Jupyter Hub

Urika GX is shipped with Jupyter and Jupyter Hub. Open an ssh connection using the ssh config shown earlier, configure the Firefox proxy settings, then open http://login1:7800/. After you log in, a copy of Jupyter is started for you.

Important notice. Don't forget to terminate any Kernels after you finish, otherwise they may keep resources reserved, and you will be billed for these resources.

Custom Jupyter on login node

cd
module load python/3.6
jupyter notebook --no-browser

This will print you a link to open in Firefox (with proxy). Do not share this link with others and do not disable token/password authentication!

Create a new notebook with HLRS Python 3.6 kernel or change a Kernel of an existing notebook. This Kernel supports some extra magics and has Internet enabled for you to be able to install pip packages.

Important notice. Don't forget to terminate any Kernels after you finish, otherwise they may keep resources reserved, and you will be billed for these resources.

Custom Jupyter on compute node

The procedure is almost the same as in the previous section. Start command:

cd
module load tools/mesos
mnotebook

This will print you a link to open in Firefox (with proxy). Do not share this link with others and do not disable token/password authentication!

Create a new notebook with HLRS Python 3.6 kernel or change a Kernel of an existing notebook. This Kernel supports some extra magics. There is no Internet access on compute nodes.

Important notice. Don't forget to terminate any Kernels after you finish, otherwise they may keep resources reserved, and you will be billed for these resources.

Special magics

%spark / %%spark

  • %spark - will start a spark session with default values: --name "Jupyter PySpark" --total-executor-cores 36 --executor-memory 1g

  • %spark 180 450g MyTask - will start a Spark session overwriting the default values

  • %spark 180 450g MyTask --driver-memory 100g --packages p1,p2 - will start a Spark session overwriting the default values and adding extra arguments

  • The last magic in multi-line format:

    %%spark 180 450g MyTask
    --driver-memory 100g
    --packages p1,p2

%spark_nohive / %%spark_nohive

These work like the %spark and %%spark magics but do not load Hive support. This may be useful for debugging.

%project

  • %project project_name - creates the path ~/.pyprojects/project_name for pip packages. An environment variable and sys.path are updated so that custom packages can be installed.

After this magic is called, !pip install and !pip install --user both install packages into sub-folders of ~/.pyprojects/project_name. User packages take priority over system ones (usually this is not the case in Python). !pip uninstall -y first uninstalls the user package, if any. It fails when trying to uninstall a system package.

Known issues:

  • some packages may not be installed in this mode. You can still try installing them with !pip install --user without %project magic;
  • package installation from Internet is available only on the login1 node.