
1. Introduction

Remote submission of IMAS HPC workflows can be problematic, since it requires an IMAS environment to be installed on each of the many available supercomputers. To address this, a system combining a virtualized environment with IMAS installed and a remote submission tool has been designed.

2. Getting started

This tutorial describes the steps to submit an example uDocker IMAS workflow image to a remote supercomputer. To do so we will make use of uDocker and SUMI (SUbmission Manager for IMAS).

These work on different sides of the system: the local machine and the remote HPC system:

  • Connect from a local computer to a remote cluster to submit the IMAS workflow: SUMI
  • Bring the IMAS environment to heterogeneous supercomputer systems: uDocker image

This tutorial assumes that the user has a functioning machine with a distribution of GNU/Linux installed.

The following tutorial has been tested on the following machines:

  • ITER cluster @ITER
  • Marconi @Cineca
  • Marconi Gateway @Cineca
  • Eagle @PSNC

This tutorial follows these steps:

  1. Install SUMI.
  2. Passwordless connection with Marconi.
  3. Install uDocker on Marconi.
  4. SCP Docker image from Gateway to Marconi and create container.
  5. Test image on Marconi.
  6. Configure and submit workflow, then check output on GW.

3. Install SUMI

SUMI is a tool capable of submitting jobs to remote HPC clusters and of uploading and retrieving data from them.

3.1. Install SUMI on Marconi Gateway

This subsection describes how to install SUMI on Marconi Gateway. If you want to install it on ITER cluster, please move to the following subsection.

In this tutorial the local computer used will be Marconi Gateway. Given that SUMI requires Python 2.7, first we load the corresponding module:

module load python/2.7.12

SUMI depends on two powerful libraries to perform its tasks: Paramiko (data transfer) and SAGA (remote job submission). To install these dependencies we need to download Python libraries, but since we do not have root permissions we will use a virtual environment, which allows us to install the libraries locally. For this purpose we use "virtualenv", which creates a Python environment in a local folder where libraries can be installed. To set it up we run the following commands:

mkdir sumi-virtualenv
virtualenv sumi-virtualenv

Once the virtualenv folder has been configured, we can load the environment.

For TCSH shells (as on the Gateway):

source sumi-virtualenv/bin/activate.csh
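For Bash shells, the equivalent activation script provided by virtualenv is used:

source sumi-virtualenv/bin/activate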

 

Our terminal prompt will now show the environment name in front of our username in the following way:

[sumi-virtualenv] <g2user@s65

To install the dependencies we can now run the "pip" command, which will install the Python libraries into our local virtualenv:

pip install saga-python==0.50.01 paramiko

Once the dependencies have been installed, we can download and configure SUMI. To retrieve the code, download and extract the release tarball:

wget -qO- https://bitbucket.org/albertbsc/sumi/downloads/sumi-0.1.0.tar.gz | tar xvz

This will create a local folder named "sumi". To be able to run the software, we need to include it in the $PATH environment variable.

For TCSH shells (as on the Gateway):

setenv PATH $PATH\:$HOME/sumi/bin

For Bash shells:

export PATH=$PATH:$PWD/sumi/bin/
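To make this setting persistent across sessions, the line can be appended to the shell startup file; a minimal sketch, assuming SUMI was extracted in $HOME:

# in ~/.cshrc (TCSH, e.g. Gateway)
setenv PATH $PATH\:$HOME/sumi/bin

# in ~/.bashrc (Bash)
export PATH=$PATH:$HOME/sumi/bin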

SUMI requires two configuration files which contain the information about the jobs to be submitted and the HPC clusters where we are going to submit them. For this we need to create the configuration folder and copy the configuration files jobs.conf and servers.conf from the sumi/conf/ directory:

mkdir $HOME/.sumi
cp sumi/conf/*.conf $HOME/.sumi
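After copying, the configuration folder should contain the two files (assuming sumi/conf/ ships only these two):

$ ls $HOME/.sumi
jobs.conf  servers.conf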

Now we are ready to run SUMI. Execute the option "-h" to see all the available options:

$ sumi -h
usage: sumi.py [-h] [-r] [-u UP] [-d DOWN] [-m MACHINE [MACHINE ...]]
               [-j JOBS [JOBS ...]]

Submission Manager for IMAS
optional arguments:
  -h, --help            show this help message and exit
  -r, --run             Run the configured job on a specific cluster
  -u UP, --upload UP    Upload local file
  -d DOWN, --download DOWN
                        Download remote file
  -m MACHINE [MACHINE ...], --machine MACHINE [MACHINE ...]
                        List of machines were to submit
  -j JOBS [JOBS ...], --job JOBS [JOBS ...]
                        List of jobs to be submitted

SUMI has now been installed successfully.

3.2. Install SUMI on ITER cluster

This subsection describes how to install SUMI on ITER cluster. If you want to install it on Marconi Gateway, please move to the previous subsection.

In this tutorial the local computer used will be the ITER cluster. Given that SUMI requires Python 2.7, first we load the corresponding module:

module load python/2.7.15

SUMI depends on two powerful libraries to perform its tasks: Paramiko (data transfer) and SAGA (remote job submission). Since we do not have root permissions, we install these Python libraries locally using pip's "--user" flag, which places them under $HOME/.local:

pip install saga-python==0.50.01 --user
pip install paramiko --user

Once the dependencies have been installed, we can download and configure SUMI. To retrieve the code, download and extract the release tarball:

wget -qO- https://bitbucket.org/albertbsc/sumi/downloads/sumi-0.1.0.tar.gz | tar xvz

This will create a local folder named "sumi". To be able to run the software, we need to include it in the $PATH environment variable.

For Bash shells:

export PATH=$PATH:$PWD/sumi/bin/

SUMI requires two configuration files which contain the information about the jobs to be submitted and the HPC clusters where we are going to submit them. For this we need to create the configuration folder and copy the configuration files jobs.conf and servers.conf from the sumi/conf/ directory:

mkdir $HOME/.sumi
cp sumi/conf/*.conf $HOME/.sumi

Now we are ready to run SUMI. Execute the option "-h" to see all the available options:

$ sumi -h
usage: sumi.py [-h] [-r] [-u UP] [-d DOWN] [-m MACHINE [MACHINE ...]]
               [-j JOBS [JOBS ...]]

Submission Manager for IMAS
optional arguments:
  -h, --help            show this help message and exit
  -r, --run             Run the configured job on a specific cluster
  -u UP, --upload UP    Upload local file
  -d DOWN, --download DOWN
                        Download remote file
  -m MACHINE [MACHINE ...], --machine MACHINE [MACHINE ...]
                        List of machines were to submit
  -j JOBS [JOBS ...], --job JOBS [JOBS ...]
                        List of jobs to be submitted

SUMI has now been installed successfully on the ITER cluster.

4. Passwordless configuration with Marconi


To use SUMI we need a passwordless connection. This means that the targeted HPC machine needs to have the public key of our Gateway account in its list of authorized keys. If we do not have a ~/.ssh/id_rsa.pub file, we have to generate our keys:

ssh-keygen
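If desired, the key type and size can be given explicitly; pressing Enter at the prompts accepts the default file location:

ssh-keygen -t rsa -b 4096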


To copy our key we use the "ssh-copy-id" command to copy it from Gateway to Marconi:

ssh-copy-id username@login.marconi.cineca.it
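If ssh-copy-id is not available, the key can be appended manually; the following sketch achieves the same result:

cat ~/.ssh/id_rsa.pub | ssh username@login.marconi.cineca.it 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'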

Now, if we establish an SSH connection, the prompt will not ask for any password and will give us a terminal inside Marconi:

 ssh username@login.marconi.cineca.it
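A quick way to verify the setup is to run a single remote command; if no password prompt appears, the passwordless configuration works:

ssh username@login.marconi.cineca.it hostname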

 

5. Install uDocker on Marconi

For this step, follow the instructions found in the uDocker documentation.

5.1. Known issues

When setting up a uDocker image locally, make sure that there is enough quota for your user. Otherwise the process may crash or show untar-related errors such as the following when running the udocker load command:

    Error: failed to extract container:
    Error: loading failed
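Before loading the image, the available quota can be checked; a sketch using the standard quota command, assuming disk quotas are enabled on the system:

quota -s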

6. SCP Docker image from Gateway to Marconi and create container

The Docker image is available on the Gateway and you can use it on any machine, including your own laptop:

ls ~g2tomz/public/imas-installer*

To copy the image from Gateway to Marconi we run the scp command:

scp ~g2tomz/public/imas-installer-20180921112143.tar.xz username@login.marconi.cineca.it:

Once it has been copied, we log in to Marconi and run the following commands to load the image and create the container:

$HOME/.local/bin/udocker load -i imas-installer-20180921112143.tar.xz

$HOME/.local/bin/udocker create --name=imas imas-installer:20180921112143
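To verify that the image was loaded and the container created, udocker can list both:

$HOME/.local/bin/udocker images
$HOME/.local/bin/udocker ps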

 

 

7. Test image on Marconi

Once the image has been loaded we can interact with it by getting a bash shell inside the container with the following command:

$HOME/.local/bin/udocker run imas /bin/bash

The commands that we are going to run for our example case are the following (the same as used in the uDocker documentation):

module load imas kepler
module load keplerdir
imasdb test
export USER=imas
kepler -runwf -nogui -user imas /home/imas/simple-workflow.xml

 

Running these commands inside the container terminal will start the workflow.
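The same sequence can also be launched non-interactively in a single command, which is closer to how a batch job would execute it; a sketch, assuming the module environment is initialized in a login shell inside the container:

$HOME/.local/bin/udocker run imas /bin/bash -lc 'module load imas kepler; module load keplerdir; imasdb test; export USER=imas; kepler -runwf -nogui -user imas /home/imas/simple-workflow.xml'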

8. Configure and submit workflow, then check output on GW

To configure a job we have to edit the files copied earlier to the configuration folder:

mkdir $HOME/.sumi
cp sumi/conf/*.conf $HOME/.sumi

8.1. jobs.conf

The configuration file jobs.conf, located in the local directory $HOME/.sumi/, contains the configuration for the jobs to be run. The sample configuration file located at $SUMI_DIR/conf/jobs.conf has the following content:

    [test]
    udocker = udocker.py
    arguments =
    cpus = 1
    time = 1
    threads_per_process = 1
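As an illustration, a filled-in entry for our example workflow might look like the following sketch; the udocker path, the argument string and the interpretation of cpus and time are assumptions, not confirmed semantics:

    [test]
    udocker = $HOME/.local/bin/udocker
    arguments = run imas /bin/bash -lc 'kepler -runwf -nogui -user imas /home/imas/simple-workflow.xml'
    cpus = 1
    time = 1
    threads_per_process = 1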

8.2. servers.conf

The configuration file servers.conf, located in the local directory $HOME/.sumi/, contains the configuration for the servers to which SUMI will connect. The sample configuration file located at $SUMI_DIR/conf/servers.conf has the following content:


[machine]
server = example.com
user = username
manager = slurm
protocol = ssh
upload_files =
upload_to =
download_files =
download_to =

 

To configure the login node of your cluster, specify the login node address, your user name and the name of the resource manager; the accepted values are sge, slurm and pbs.

SUMI allows files to be uploaded and downloaded automatically. For this we can assume a directory "mywf" in our remote Marconi home directory and another one in our local Gateway account, as well as a "mywfresults" folder on Gateway. This leads to a configuration like the following:

 

[marconi]
server = login.marconi.cineca.it
user = my_username
manager = slurm
protocol = ssh
upload_files = /afs/eufus.eu/g2itmdev/user/my_username/mywf/*
upload_to = /marconi/home/userexternal/agutierr/mywf/
download_files = /marconi/home/userexternal/agutierr/mywf/test*
download_to = /afs/eufus.eu/g2itmdev/user/g2agutie/mywfresults/

Once the job has been configured, we can run it using the following command:

sumi -r -j test -m marconi
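The -u and -d options shown in the help output can also transfer individual files by hand; a sketch with hypothetical file names, assuming the remote paths come from the upload_to and download_to entries in servers.conf:

sumi -u myinput.txt -m marconi
sumi -d test_output.txt -m marconi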

 

This will copy the files, run the workflow and retrieve the output results. Once we have them, we can check whether the IDS has been generated correctly:

idsdump 1 1 pf_active

This will show the correct structure of the generated IDS.

 

 

 
