- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

How to use Conda environments on the clusters


This guide shows you how to move a conda environment from your local machine to the clusters, which do not have internet access.

Assumptions:

  • You already have a virtual environment called my_env and you want to move this environment.
  • The environment can have packages installed with conda and pip.
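
For illustration, such an environment could have been created locally along these lines (the Python version and package names are placeholders; adjust them to your needs):

(base) $ conda create -n my_env python=3.10
(base) $ conda activate my_env
(my_env) $ conda install numpy      # example of a package installed with conda
(my_env) $ pip install requests     # example of a package installed with pip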

Warning: Conda/pip downloads and installs precompiled binaries suited to the architecture and OS of the local environment, and might compile packages from source for the local architecture when necessary. Such packages are not guaranteed to run correctly on the target system.

Using conda-pack

Install conda-pack in the base or root environment:

(my_env) $ conda deactivate
(base) $ conda install -c conda-forge conda-pack

Package the environment and transfer the archive to the clusters (e.g., scp):

(base) $ conda pack -n my_env -o my_env.tar.gz
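
For example, the archive can be copied to a cluster with scp (the user name, host name, and target path below are placeholders):

(base) $ scp my_env.tar.gz <username>@<cluster-login-node>:/path/to/my_env.tar.gz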

Work interactively on a single node

A large number of files decreases the performance of the parallel file system. You can use the ram disk instead:

qsub -I -l select=1:node_type=clx-25 -l walltime=00:30:00 # modify this line to work on Hawk, or to select different resources

export ENV_PATH=/run/user/$PBS_JOBID/my_env # We use the ram disk to extract the environment packages since a large number of files decreases the performance of the parallel file system.
export ENV_ARCHIVE=/path/to/my_env.tar.gz # Adjust the path.

mkdir -p $ENV_PATH

tar -xzf $ENV_ARCHIVE -C $ENV_PATH # Extract the packages into the ram disk.

source $ENV_PATH/bin/activate

conda-unpack

# Use the environment here.
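
# For example, a quick (hypothetical) sanity check that the unpacked environment is active:
python -c "import sys; print(sys.prefix)" # should print a path under $ENV_PATH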

rm -rf $ENV_PATH # It's nice to clean up before you terminate the job.

Use the environment on multiple nodes

For multi-node jobs, prepare a batch script. The steps to start a distributed Python application on multiple nodes depend on the third-party library (e.g., Dask, Ray). Regardless of the library, if you want to keep using the ram disk for the virtual environment, you must extract the archive on each node separately, as in the sketch below. Our documentation provides scripts to launch a Ray cluster using conda virtual environments (see Prepare_scripts_to_launch_a_Ray_cluster_on_the_compute_nodes).
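
A minimal sketch of such a batch script, assuming PBS Pro with pbsdsh spawning one task per node (the resource selection, paths, and application launch are placeholders; adjust them to your setup):

#!/bin/bash
#PBS -l select=2:node_type=clx-25
#PBS -l walltime=00:30:00

export ENV_PATH=/run/user/$PBS_JOBID/my_env
export ENV_ARCHIVE=/path/to/my_env.tar.gz # Adjust the path.

# Extract and unpack the environment into the ram disk on every node of the job.
pbsdsh -- bash -c "mkdir -p $ENV_PATH && tar -xzf $ENV_ARCHIVE -C $ENV_PATH && source $ENV_PATH/bin/activate && conda-unpack"

# Activate the environment in this shell as well, then launch the distributed
# application; how to do that depends on the third-party library (e.g., Dask, Ray).
source $ENV_PATH/bin/activate

# Clean up the ram disk on every node before the job ends.
pbsdsh -- rm -rf $ENV_PATH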