- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Data Transfer with UFTP

Revision as of 16:25, 21 January 2020

Currently in testing and not production ready.

UFTP Service

UFTP is a data streaming library and file transfer tool. It is integrated into UNICORE, where it transfers data from client to server (and vice versa) and provides data staging between UFTP-enabled UNICORE sites. UFTP can also be used independently of UNICORE, which requires an authentication server and a standalone UFTP client. You can install the standalone client on your desktop and use it to copy data from or to the file systems of the supercomputers.

Authentication server: gridftp-fr1.hww.hlrs.de

Port: 9000

Commonly used command examples (unfinished)

Additional command line options (unfinished)

Setup at HLRS

By design you need to have console access to a system where either the source or destination filesystem is directly mounted/accessible. The other side needs to have a running UFTPD server with access to the destination/source filesystem.

Here at HLRS we have five data transfer nodes (data transfer backends, BE) and two authentication servers (data transfer frontends, FE). The main system is gridftp-fr1, which uses the majority of the BEs for transfers in order to increase the usable bandwidth.

Possible scenarios (unfinished)

Since data can be both pushed and pulled, it does not matter on which side of the transfer you are logged in.

  • Upload/download between your home site (normally not running UFTP servers) and HLRS
    • Install the UFTP client or use a preinstalled client at your site
  • Upload/download between a UFTP-enabled site (such as the GCS centers) and HLRS

You then have two options:

    • Initiating the transfer from the remote site
      • Same as for your home site, but the client is already installed
    • Initiating the transfer from HLRS to the remote site (not yet ready)
      • Possible options, not implemented yet
      • Log in to our data transfer node, where the client tools are already installed
      • Run the client from the cluster frontend

UFTP client (unfinished)

The UFTP standalone client is a Java-based client for UFTP and runs under Linux. It can list remote directories, copy files (with many options, such as wildcards), and sync single files. It supports username/password authentication and SSH key authentication to a UFTP authentication server.
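As a rough sketch of what the three operations could look like against the HLRS authentication server: the host and port are the ones given on this page, while the /rest/auth/ URL layout and the -u option are taken from the upstream UFTP client documentation and should be treated as assumptions for this installation. SERVER_NAME, the remote path, and the helper uftp_url are hypothetical placeholders. The uftp calls are left as comments so that the snippet runs without the client installed.

```shell
# Host and port as listed on this page.
AUTH_HOST="gridftp-fr1.hww.hlrs.de"
AUTH_PORT=9000

# Hypothetical helper: build the remote URL for a UFTPD server name ($1)
# and an absolute remote path ($2). The /rest/auth/ layout is an assumption.
uftp_url() {
    printf 'https://%s:%s/rest/auth/%s:%s\n' "$AUTH_HOST" "$AUTH_PORT" "$1" "$2"
}

# SERVER_NAME and the path are placeholders, not real HLRS values.
remote=$(uftp_url "SERVER_NAME" "/path/on/remote/filesystem/")
echo "$remote"

# With the client installed, the operations named above would look roughly like:
#   uftp ls   -u "$USER" "$remote"                      # list a remote directory
#   uftp cp   -u "$USER" "${remote}*.dat" .             # copy files using a wildcard
#   uftp sync -u "$USER" "${remote}file.dat" file.dat   # sync a single file
```

The wildcard is quoted so that it is passed to the client instead of being expanded by the local shell.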

Installation

First you need a working Java 8 Runtime Environment. The UNICORE download page contains all UNICORE components. Unless you want to use the full potential of UNICORE, the UFTP client is sufficient, and you can use the direct link to SourceForge. The directory for a version contains three installation methods:

  • RPM based version (for Redhat based systems, also CentOS, Scientific Linux, ...)
  • DEB based version (for Debian based systems, also Ubuntu, ...)
  • zip Archive (for all systems including above mentioned ones)
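Whichever method is used, the Java prerequisite can be checked beforehand. The sketch below parses the output of java -version; the helper name java_major is made up for illustration, and accepting versions newer than 8 is an assumption, since this page only mentions Java 8.

```shell
# Hypothetical helper: extract the major version from a Java version string
# such as "1.8.0_232" (legacy scheme) or "11.0.5" (current scheme).
java_major() {
    case "$1" in
        1.*) v=${1#1.}; printf '%s\n' "${v%%.*}" ;;   # 1.8.0_232 -> 8
        *)   printf '%s\n' "${1%%[.-]*}" ;;            # 11.0.5 -> 11
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version prints to stderr; the version is the first quoted string.
    version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    # Assumption: anything from Java 8 upwards is acceptable.
    if [ -n "$version" ] && [ "$(java_major "$version")" -ge 8 ]; then
        echo "Java $version found"
    else
        echo "Java 8 is required (found: ${version:-unknown})" >&2
    fi
else
    echo "No java executable found in PATH" >&2
fi
```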

You can use any of the listed installation methods. Since the RPM and DEB installations should be self-explanatory, we explain only the manual installation (detailed instructions can be found here).

  • Download the newest client file, uftp-client-x.x.x-all.zip
  • Extract it wherever you want
    • unzip uftp-client-<uftp_client_version>-all.zip -d <UFTP_CLIENT_INSTALL_DIR>/
  • Run the client to verify that the environment is set up properly
    • <UFTP_CLIENT_INSTALL_DIR>/bin/uftp -h
  • If this works, you can start to transfer data
  • For easier use, you can either create an alias or adjust your $PATH variable in your ~/.bashrc
    • alias uftp=<UFTP_CLIENT_INSTALL_DIR>/bin/uftp
    • export PATH=$PATH:<UFTP_CLIENT_INSTALL_DIR>/bin
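For the $PATH variant, a slightly more defensive form for ~/.bashrc might look like the sketch below. It appends the client's bin directory only if it is not already listed, so sourcing the file repeatedly does not grow PATH. The default location $HOME/opt/uftp-client and the helper name path_append are hypothetical; substitute your own installation directory.

```shell
# Hypothetical install location; replace with your own <UFTP_CLIENT_INSTALL_DIR>.
UFTP_CLIENT_INSTALL_DIR="${UFTP_CLIENT_INSTALL_DIR:-$HOME/opt/uftp-client}"

# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append "$UFTP_CLIENT_INSTALL_DIR/bin"
export PATH
```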

UFTP detailed

Features

  • dynamic firewall port opening using a pseudo FTP connection. UFTPD requires only a single open port.
  • optional encryption of the data streams using a symmetric key algorithm
  • optional compression of the data streams (using gzip)
  • partial reads/writes to a file. If supported by the filesystem, multiple UFTP processes can thus read/write a file in parallel (striping)
  • supports efficient synchronization of single local and remote files using the rsync algorithm
  • integrated into UNICORE clients for fast file upload and download
  • integrated with UNICORE servers for fast data staging and server-to-server file transfers
  • standalone (non-UNICORE) client available

How does UFTP work?

The server part, called uftpd, listens on two ports (which may be on two different network interfaces):

  • the command port receives control commands (for connections from authentication server)
  • the listen port accepts data connections from clients.

The uftpd server is "controlled" (usually by UNICORE/X) via the command port, and receives/sends data directly from/to a user’s client machine or another UFTP-enabled UNICORE server. Data connections are made to the "listen" port, which has to be accessible from external machines. Firewalls have to treat the "listen" port as an FTP port. A UFTP file transfer works as follows:

  • the UNICORE/X server (or authentication server) sends a request to the command port. This request notifies the UFTPD server about the upcoming transfer and contains the following information
    • the client’s IP address
    • the source/target file name
    • whether to send or receive data
    • a "secret", i.e. a string the client will send to authenticate itself
    • how many data connections will be opened
    • the user and group id under which to create the file (in case of send mode)
    • an optional key to encrypt/decrypt the data
  • the UFTPD server will now accept an incoming connection from the announced IP address, provided the supplied "secret" matches the expectation.
  • if everything is OK, the requested number of data connections from the client can be opened. Firewall traversal will be negotiated using a pseudo FTP protocol.
  • the file is sent/received using the requested number of data connections
  • to access the requested file, uftpd attempts to switch its user id to the requested one prior to reading/writing the file. This uses a C library which is accessed from Java via the Java Native Interface (JNI). See also the installation section below.
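The acceptance step in this sequence can be pictured with a small shell model. It is only an illustration of the check described above (announced IP address plus matching secret), not code from uftpd; all variable and function names and all example values are made up.

```shell
# Illustrative model only, not uftpd code. All names and values are hypothetical.
# What the authentication server announced via the command port:
announced_ip="192.0.2.10"          # the client's IP address
announced_secret="example-secret"  # string the client must present
announced_streams=2                # number of data connections to expect

# Return 0 if a connecting client ($1 = IP, $2 = secret) matches the
# announcement, non-zero otherwise.
accept_client() {
    [ "$1" = "$announced_ip" ] && [ "$2" = "$announced_secret" ]
}

if accept_client "192.0.2.10" "example-secret"; then
    echo "accepted: up to $announced_streams data connections may be opened"
else
    echo "rejected"
fi
```

A connection from a different address, or one presenting the wrong secret, falls through to the rejected branch.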