...
Remote submission of IMAS HPC workflows can be challenging due to two main factors: connecting and communicating with heterogeneous supercomputers, and being able to run the IMAS environment inside a supercomputer within user space. To address these issues, a setup combining a virtualized environment with IMAS installed and a remote submission system has been designed.
The use case considered is an IMAS developer, working on an environment with IMAS installed, who wants to run an IMAS workflow on a large supercomputer. The environments where the developer will be working are mainly the Marconi Gateway partition, the ITER cluster and, occasionally, a local user computer. From there, the workflow will be submitted to a remote supercomputer (mainly PRACE HPC facilities).
Fig. 1 describes the scheme of the methodology developed to submit remote workflows. Working on the Gateway, the ITER cluster or a local machine, we configure SUMI (SUbmission Manager for IMAS), which allows the submission of jobs to the remote queuing systems of supercomputers. These supercomputers will run the IMAS workflow using a Docker image running on top of uDocker.
This tutorial assumes that the user has a functioning machine with a GNU/Linux distribution installed.
Figure 1. Scheme for the remote submission of IMAS workflows involving HPC codes.
Getting started
This tutorial describes the steps to submit an example uDocker IMAS workflow image from the Gateway or the ITER cluster to a remote supercomputer. To do so we will use uDocker and SUMI (SUbmission Manager for IMAS).
These tools work on different sides of the system, the local machine and the remote HPC system:
- Connect from a local computer to a remote cluster to submit my workflow: SUMI
- Bring the IMAS environment to heterogeneous supercomputer systems: uDocker image
This tutorial has been tested on the following machines:
- ITER cluster @ITER
- Marconi @Cineca
- Marconi Gateway @Cineca
- Eagle @PSNC
The tutorial follows these steps:
...
SUMI is a tool capable of submitting jobs to remote HPC clusters and of uploading data to and retrieving data from them. The code is available at the following link:
https://bitbucket.org/albertbsc/sumi/
New releases of the code can be found in the following link:
https://bitbucket.org/albertbsc/sumi/downloads/
The following subsections describe how to install it for Gateway and ITER cluster.
Install SUMI on Marconi Gateway
This subsection describes how to install SUMI on Marconi Gateway. If you want to install it on ITER cluster, please move to the following subsection.
In this tutorial the local computer used will be Marconi Gateway. Given that SUMI uses Python 2.7, we first load the corresponding module:
Code Block | ||
---|---|---|
| ||
module load python/2.7.12 |
SUMI depends on two powerful libraries to perform its tasks: Paramiko (data transfer) and SAGA (remote job submission). To install these dependencies we need to download Python libraries, but since we do not have root permissions we will use a virtual environment, which allows us to install the libraries locally. For this purpose we will use "virtualenv", which creates a Python environment in local folders. To set it up we run the following commands:
...
Code Block | ||
---|---|---|
| ||
source sumi-virtualenv/bin/activate.csh |
...
Our terminal prompt will now show the folder name in front of our username in the following way:
...
Once the dependencies have been installed, we can download and configure SUMI. To retrieve the code, run the following command:
Code Block |
---|
wget -qO- https://bitbucket.org/albertbsc/sumi/downloads/sumi-0.1.0.tar.gz | tar xvz |
...
Code Block | ||
---|---|---|
| ||
setenv PATH $PATH\:$HOME/sumi/bin |
For Bash shells
Code Block | ||
---|---|---|
| ||
export PATH=$PATH:$PWD/sumi/bin/ |
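After extending PATH it can be useful to confirm that the shell actually resolves the `sumi` command. The following is a minimal sketch for Bash or other sh-compatible shells (not for csh); the helper name `check_on_path` is hypothetical and not part of SUMI.

```shell
# Hypothetical helper: report whether a command can be found through PATH.
# Prints the resolved location on success and a hint on failure.
check_on_path() {
    cmd="$1"
    if location=$(command -v "$cmd" 2>/dev/null); then
        echo "$cmd found at $location"
        return 0
    else
        echo "$cmd not found in PATH; check the directory added to PATH" >&2
        return 1
    fi
}
```

For example, `check_on_path sumi` should print the location inside `sumi/bin` if the export above took effect in the current shell.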
SUMI requires two configuration files, which describe the jobs to be submitted and the HPC clusters where they will be submitted. For this we need to create the configuration folder and copy the configuration files jobs.conf and servers.conf from the sumi/conf/ directory.
...
Now we are ready to run SUMI. Run it with the "-h" option to see all the available options:
Code Block |
---|
$ sumi -h |
usage: sumi.py [-h] [-r] [-u UP] [-d DOWN] [-m MACHINE [MACHINE ...]] [-j JOBS [JOBS ...]] |
Submission Manager for IMAS |
optional arguments: |
-h, --help show this help message and exit |
-r, --run Run the configured job on a specific cluster |
-u UP, --upload UP Upload local file |
-d DOWN, --download DOWN Download remote file |
-m MACHINE [MACHINE ...], --machine MACHINE [MACHINE ...] List of machines were to submit |
-j JOBS [JOBS ...], --job JOBS [JOBS ...] List of jobs to be submitted |
...
Code Block | ||
---|---|---|
| ||
module load python/2.7.15 |
SUMI depends on two powerful libraries to perform its tasks: Paramiko (data transfer) and SAGA (remote job submission). To install these dependencies we need to download Python libraries, but since we do not have root permissions we will use the "--user" option of pip, which installs the libraries locally. To install the dependencies, run the following "pip" commands:
Code Block |
---|
pip install saga-python==0.50.01 --user |
pip install paramiko --user |
Once the dependencies have been installed, we can download and configure SUMI. To retrieve the code, run the following command:
Code Block |
---|
wget -qO- https://bitbucket.org/albertbsc/sumi/downloads/sumi-0.1.0.tar.gz | tar xvz |
...
SUMI requires two configuration files, which describe the jobs to be submitted and the HPC clusters where they will be submitted. For this we need to create the configuration folder and copy the configuration files jobs.conf and servers.conf from the sumi/conf/ directory.
...
Passwordless configuration with Marconi
Because SUMI uses SSH channels to submit jobs and to transfer data, we need a passwordless connection to avoid entering the password at every step. To achieve this, the targeted HPC machine needs to have the public key of our Gateway account in its list of allowed keys. If we don't have a ~/.ssh/id_rsa.pub file in our Gateway/ITER account, we have to generate our keys:
Code Block | ||
---|---|---|
| ||
ssh-keygen |
This will generate the ~/.ssh/id_rsa files. To add our public key to the list of allowed connections we will use the "ssh-copy-id" command to copy it from Gateway to Marconi:
Code Block | ||
---|---|---|
| ||
ssh-copy-id username@login.marconi.cineca.it |
Now, if we establish an SSH connection, the prompt will not ask for any password and will give us a terminal inside Marconi.
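Since SUMI relies on non-interactive SSH, it may help to test the connection in a way that fails instead of prompting. The sketch below is an illustration, not part of SUMI: the function name `check_passwordless` is hypothetical, and the 10-second timeout is an arbitrary choice. It uses the OpenSSH options `BatchMode` and `ConnectTimeout`.

```shell
# Hypothetical helper: return 0 if an SSH login to the given target works
# without any prompt. BatchMode=yes makes ssh fail instead of asking for a
# password or passphrase; ConnectTimeout (assumed 10 s) avoids hanging.
check_passwordless() {
    target="$1"
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$target" true; then
        echo "passwordless login to $target works"
    else
        echo "passwordless login to $target failed" >&2
        return 1
    fi
}
```

For example, `check_passwordless username@login.marconi.cineca.it` should succeed once the public key has been copied with "ssh-copy-id".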
...
Known issues
When setting up a Docker image locally, make sure that there is enough quota for your user. Otherwise it may crash or show untar-related errors such as the following when running the udocker load command:
Error: failed to extract container:
Error: loading failed
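A rough pre-check before loading an image is to compare the size of the image file with the space available on the target filesystem. The sketch below rests on several assumptions: the helper name `has_space_for` is hypothetical, the default safety factor of 3 is a guess at how much a compressed image expands, and `df` reports free space on the filesystem rather than a per-user quota, so on systems with quotas the site's own quota tool remains the authoritative check.

```shell
# Hypothetical helper: return 0 if the filesystem holding DIR has at least
# FACTOR times the size of FILE available. FACTOR defaults to 3, an assumed
# margin for the expansion of a compressed image.
has_space_for() {
    file="$1"
    dir="$2"
    factor="${3:-3}"
    # Size of the image file in KiB, rounded up from its byte count.
    need_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    # Available KiB on the target filesystem (POSIX format, 4th column).
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_kb * factor)) ]
}
```

For example, `has_space_for imas-installer-20180921112143.tar.xz "$HOME" || echo "not enough space"` checks the home filesystem, assuming udocker stores its images under the home directory.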
...
The Docker image is available at Gateway and you can use it on any machine, including your own laptop:
Code Block |
---|
ls ~g2tomz/public/imas-installer* |
...
Code Block |
---|
$HOME/.local/bin/udocker load -i imas-installer-20180921112143.tar.xz |
$HOME/.local/bin/udocker create --name=imas imas-installer:20180921112143 |
Test image on Marconi
Once the image has been loaded we can interact with it. To get a bash shell inside the container, run the following command:
...
Code Block |
---|
module load imas kepler |
module load keplerdir |
imasdb test |
export USER=imas |
kepler -runwf -nogui -user imas /home/imas/simple-workflow.xml |
Running these commands inside the terminal starts the workflow. This means that we will be running a Kepler workflow on a PRACE machine which does not have the IMAS environment installed, because the workflow runs inside the Docker image. This is a sample of the output:
Configure and submit workflow, then check output on GW
...