- Information contained in the HLRS Wiki is not legally binding and is provided without warranty; HLRS is not responsible for any damages that might result from its use -
Data Transfer with UFTP
Revision as of 13:16, 22 January 2020
Currently in testing, not production ready, so
UFTP Service
UFTP is a data streaming library and file transfer tool. It is integrated into UNICORE, where it transfers data between client and server (in both directions) and provides data staging between UFTP-enabled UNICORE sites. UFTP can also be used independently of UNICORE; this requires an authentication server and a standalone UFTP client. You can install the standalone client on your desktop and use it to copy data to or from the file systems of the supercomputers.
Authentication server: gridftp-fr1.hww.hlrs.de
Port: 9000
Commonly used command examples (unfinished)
Additional command line options (unfinished)
Setup at HLRS
By design you need to have console access to a system where either the source or destination filesystem is directly mounted/accessible. The other side needs to have a running UFTPD server with access to the destination/source filesystem.
Here at HLRS we have 5 data transfer nodes (data transfer backends, BE) and two authentication servers (data transfer frontends, FE). The main system is gridftp-fr1, which uses the majority of the BEs for transfers to increase the usable bandwidth.
Possible scenarios (unfinished)
Since both pushing and pulling data are possible, it does not matter on which side of the transfer you are logged in.
- Upload/download between your home site (normally not running UFTP servers) and HLRS
- Install the UFTP client or use a preinstalled client at your site
- Upload/download between a UFTP-enabled site (such as the GCS centers) and HLRS
You then have two options
- Initiating the transfer from the remote site
- Same as for your home site, but the client is already installed
- Initiating the transfer from HLRS (not yet ready) to remote site
- Possible options, not implemented yet
- Log in to our data transfer node, where the client tools are already installed
- Run the client from the cluster frontend
UFTP client
The UFTP standalone client is a Java-based client for UFTP and runs under Linux. It can list remote directories, copy files (with many options such as wildcards), and sync single files. It supports username/password authentication and SSH key authentication to a UFTP authentication server.
Installation
First you need a working Java 8 Runtime Environment. The UNICORE download page contains all components for UNICORE. Unless you want to use the full potential of UNICORE, the UFTP client is enough, and you can use the direct link to SourceForge. The directory for each version offers three installation methods:
- RPM based version (for Redhat based systems, also CentOS, Scientific Linux, ...)
- DEB based version (for Debian based systems, also Ubuntu, ...)
- zip Archive (for all systems including above mentioned ones)
You can use any of the listed installation methods. Since the RPM and DEB installations are straightforward, we explain only the manual installation (detailed instructions can be found here).
- Download the newest client uftp-client-x.x.x-all.zip file
- Extract it to a directory of your choice
unzip uftp-client-<uftp_client_version>-all.zip -d <UFTP_CLIENT_INSTALL_DIR>/
- Run the client to verify that your environment is set up properly
<UFTP_CLIENT_INSTALL_DIR>/bin/uftp -h
- If this works, you can start to transfer data
- For easier use you can create either an alias or adjust your $PATH variable in your ~/.bashrc
alias uftp=/<UFTP_CLIENT_INSTALL_DIR>/bin/uftp
export PATH=$PATH:/<UFTP_CLIENT_INSTALL_DIR>/bin/
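The verification step above can be wrapped in a small pre-flight check. The following is a minimal sketch, not part of the UFTP distribution: the function name `uftp_client_ready` and its return codes are hypothetical, and it only checks that a `java` command is on the `PATH` and that an executable exists at `bin/uftp` below the install directory. It does not verify the Java version.

```shell
# Hypothetical helper (not shipped with UFTP): checks the two prerequisites
# described above before you try a transfer.
uftp_client_ready() {
  uftp_install_dir=$1
  # A Java runtime must be reachable via PATH.
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found in PATH" >&2
    return 1
  fi
  # The client start script must exist and be executable.
  if [ ! -x "$uftp_install_dir/bin/uftp" ]; then
    echo "no executable found at $uftp_install_dir/bin/uftp" >&2
    return 2
  fi
  return 0
}
```

Call it with the directory you extracted the archive to, for example `uftp_client_ready "$HOME/uftp-client" && echo "client looks usable"` (the path is only an example).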
UFTP detailed
Features
- dynamic firewall port opening using a pseudo FTP connection. UFTPD requires only a single open port.
- optional encryption of the data streams using a symmetric key algorithm
- optional compression of the data streams (using gzip)
- partial reads/writes to a file. If supported by the filesystem, multiple UFTP processes can thus read/write a file in parallel (striping)
- supports efficient synchronization of single local and remote files using the rsync algorithm
- integrated into UNICORE clients for fast file upload and download
- integrated with UNICORE servers for fast data staging and server-to-server file transfers
- standalone (non-UNICORE) client available
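To illustrate what partial reads/writes (striping) mean in the feature list above, the sketch below copies a local file in independent byte ranges using plain `dd`. It is an analogy only and does not use or reproduce UFTP itself: the function name `stripe_copy` is hypothetical, and each background `dd` process stands in for one worker that reads one range of the source and writes it to the same offset in the destination.

```shell
# Illustration only: copy SRC to DST in STRIPES independent byte ranges.
# This mimics the idea of striping with dd; it is not how uftp is invoked.
stripe_copy() {
  sc_src=$1
  sc_dst=$2
  sc_stripes=$3
  sc_size=$(wc -c < "$sc_src")
  # Ceiling division so the last stripe covers the remainder.
  sc_chunk=$(( (sc_size + sc_stripes - 1) / sc_stripes ))
  [ "$sc_chunk" -gt 0 ] || sc_chunk=1
  : > "$sc_dst"   # create or empty the destination once
  sc_i=0
  while [ "$sc_i" -lt "$sc_stripes" ]; do
    # Worker i reads block i of the source and writes it at the same
    # offset; conv=notrunc keeps the other workers' data intact.
    dd if="$sc_src" of="$sc_dst" bs="$sc_chunk" skip="$sc_i" seek="$sc_i" \
       count=1 conv=notrunc 2>/dev/null &
    sc_i=$(( sc_i + 1 ))
  done
  wait   # all stripes must finish before the copy is complete
}
```

For example, `stripe_copy input.dat output.dat 4` copies `input.dat` in four ranges; whether this is faster than a single stream depends on the filesystem, as noted in the feature description.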
How does UFTP work?
The server part, called uftpd, listens on two ports (which may be on two different network interfaces):
- the command port receives control commands (for connections from authentication server)
- the listen port accepts data connections from clients.
The uftpd server is "controlled" (usually by UNICORE/X) via the command port, and receives/sends data directly from/to a user’s client machine or another UFTP-enabled UNICORE server. Data connections are made to the "listen" port, which has to be accessible from external machines. Firewalls have to treat the "listen" port as an FTP port. A UFTP file transfer works as follows:
- the UNICORE/X server (or authentication server) sends a request to the command port. This request notifies the UFTPD server about the upcoming transfer and contains the following information
- the client’s IP address
- the source/target file name
- whether to send or receive data
- a "secret", i.e. a string the client will send to authenticate itself
- how many data connections will be opened
- the user and group id for whom to create the file (in case of send mode)
- an optional key to encrypt/decrypt the data
- the UFTPD server will now accept an incoming connection from the announced IP address, provided the supplied "secret" matches the expectation.
- if everything is OK, the requested number of data connections from the client can be opened. Firewall traversal will be negotiated using a pseudo FTP protocol.
- the file is sent/received using the requested number of data connections
- to access the requested file, uftpd attempts to switch its user id to the requested one prior to reading/writing the file. This uses a C library which is accessed from Java via the Java native interface (JNI). See also the installation section below.
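The acceptance rule in the sequence above (a connection is only accepted from the announced IP address and with the matching "secret") can be pictured with a toy model. This is a sketch under assumptions, not uftpd code: the function names `announce_transfer` and `accept_connection`, the variable names, and the example address and secret are all made up for illustration, and the real server handles many more fields.

```shell
# Toy model only; names and logic are illustrative assumptions, not uftpd code.

# Remember what the command port request announced for one upcoming transfer.
announce_transfer() {
  announced_ip=$1
  announced_secret=$2
  announced_streams=$3
}

# Accept an incoming data connection only if both the peer address and
# the secret match the announcement.
accept_connection() {
  peer_ip=$1
  peer_secret=$2
  [ "$peer_ip" = "$announced_ip" ] && [ "$peer_secret" = "$announced_secret" ]
}

# Example: the announcement names one client, the secret, and 2 connections.
announce_transfer 192.0.2.10 s3cr3t 2
if accept_connection 192.0.2.10 s3cr3t; then
  echo "accepted: up to $announced_streams data connections"
else
  echo "rejected"
fi
```

A connection attempt from a different address, or with a different secret, fails the check and would be rejected in this model.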