1. IMAS Primer

1.1. What is IMAS?

1.2. IMAS Data Model & IDS (Frederic) - 20.09

1.2.1. IDS and time: homogeneous, heterogeneous, independent

1.2.2. occurrences

1.2.3. slices

1.3. Database entries

1.3.1. MDSPlus pulse files

2. IMAS Access Layer - 20.09

2.1. Goals of the Access Layer

The Access Layer (AL) is the central data access library. It gives users and applications access to data through APIs in several programming languages.

Its main purpose is to provide mechanisms for reading, writing and manipulating IDS data objects, as defined in the Data Dictionary (DD).

2.2. Access Layer architecture (Bartek)

To support multiple languages while maintaining a single structure definition, the AL architecture is split into several layers.

2.2.1. Application Layer

The Application Layer consists of user programs and dedicated tools that manipulate IDS data through the High Level Interfaces.

2.2.2. High Level Interfaces

This layer provides the external Application Programming Interface (API). Its code is generated automatically from the XML description of the data structures (the Data Dictionary). For each supported programming language, a high level layer is generated in the target language.

High Level Interfaces available in AL include:

Fortran
C++
Matlab
Java
Python

Methods exposed by High Level Interfaces:

Operations on a database entry:
- CREATE
- OPEN
- DELETE
- CLOSE
Operations on IDSs: the AL operates at the IDS level (with some exceptions) and provides only methods for “atomic” operations such as:
- PUT
- GET
- PUT_SLICE
- GET_SLICE

2.2.3. Low Level

The Low Level layer is implemented in C++ (with a C API) and provides unstructured data access to the underlying databases/backends. It defines an API that is used by all high level layer implementations. Knowledge of this API (presented in a later section) is not necessary for end users; it is required only by developers of new language-specific high level implementations of the AL and by developers of support tools.

2.2.4. Backends

Backends are plug-ins that allow the abstract Low Level layer to interact with physical storage.

The currently implemented backends can store data in a memory cache, in MDSPlus files, in HDF5 files and in ASCII files (the ASCII backend is used mainly for testing purposes).

2.3. High Level Interfaces and their API (Application Programming Interface)

There are currently 5 High Level Interfaces (HLIs), one for each of the following programming languages:

Fortran
C++
Java
Python
Matlab

Only Python and Matlab provide an interactive session for accessing IMAS data.

The HLI API covers all available Access Layer features:

creating a so-called new IMAS Data Entry
opening an existing IMAS Data Entry
writing data from an IDS to a Data Entry
reading data of an IDS from an existing Data Entry
deleting an IDS from an existing Data Entry
closing a Data Entry

A Data Entry is an IMAS concept for designating a pulse with given shot and run numbers located in some database (see below).

2.3.1. HLI API (Ludovic)

As an example, we will describe the Python HLI.

Documentation of all other HLIs can be found in the User guide, available from this page: https://confluence.iter.org/display/IMP/Integrated+Modelling+Home+Page

2.3.1.1. create

Creating a new Data Entry using the MDS+ backend consists of creating a new pulse file on disk. Therefore, you need write permission for the database targeted by the create() command.

So, let's first create a new database belonging to the current user.

From a new shell, execute the following command:

module load IMAS
imasdb data_access_tutorial

Now, the following code will create a new MDS+ pulse file for shot=15000, run=1 in the 'data_access_tutorial' database of the current user:

import imas
import getpass
from imas import imasdef
#creates the Data Entry object 'data_entry' associated with the pulse file with shot=15000, run=1, belonging to database 'data_access_tutorial' of the current user, using the MDS+ backend
data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())
#creates the pulse file associated with the Data Entry object 'data_entry' previously created
data_entry.create()
#closes the pulse file associated with the 'data_entry' object
data_entry.close()

Execution of the code above will create the pulse file at location ~/public/imasdb/data_access_tutorial/3/0:

$ ls -alh ~/public/imasdb/data_access_tutorial/3/0
total 78M
drwxrwsr-x 2 fleuryl fleuryl 4.0K Aug 31 10:09 .
drwxrwsr-x 12 fleuryl fleuryl 4.0K Aug 31 10:09 ..
-rw-rw-r-- 1 fleuryl fleuryl 42M Aug 31 10:09 ids_150000001.characteristics
-rw-rw-r-- 1 fleuryl fleuryl 37 Aug 31 10:09 ids_150000001.datafile
-rw-rw-r-- 1 fleuryl fleuryl 36M Aug 31 10:09 ids_150000001.tree
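
The file names in the listing above suggest a simple naming rule: the shot number followed by the run number padded to four digits (ids_SSSSRRRR). The snippet below is a minimal sketch of that rule, inferred only from the examples on this page. The helper names pulse_file_stem and pulse_file_paths are hypothetical (they are not part of the IMAS API), and the sketch assumes run numbers 0 to 9999 stored in sub-directory '0' of data version '3'.

```python
from pathlib import Path
from typing import List, Optional


def pulse_file_stem(shot: int, run: int) -> str:
    """Return the pulse file stem, e.g. shot=15000, run=1 -> 'ids_150000001'.

    Hypothetical helper: the rule is inferred from the listing on this page.
    """
    if shot < 0 or not 0 <= run <= 9999:
        raise ValueError("this sketch only covers shot >= 0 and runs 0..9999")
    return f"ids_{shot}{run:04d}"


def pulse_file_paths(database: str, shot: int, run: int,
                     data_version: str = "3",
                     home: Optional[Path] = None) -> List[Path]:
    """Return the three MDS+ files expected for one Data Entry."""
    # Directory layout assumed from the example: ~/public/imasdb/<database>/<version>/0
    base = (home or Path.home()) / "public" / "imasdb" / database / data_version / "0"
    stem = pulse_file_stem(shot, run)
    return [base / f"{stem}.{ext}" for ext in ("characteristics", "datafile", "tree")]


for path in pulse_file_paths("data_access_tutorial", 15000, 1):
    print(path)
```

Such a helper can be handy for checking from a script whether a pulse file already exists before calling create().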

2.3.1.2. open

The following code opens the existing MDS+ pulse file created previously for shot=15000, run=1, from the 'data_access_tutorial' database of the current user:

import imas
import getpass
from imas import imasdef
#creates the Data Entry object 'data_entry' associated with the pulse file with shot=15000, run=1, belonging to database 'data_access_tutorial' of the current user, using the MDS+ backend
data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())
#opens the pulse file associated to the Data Entry object 'data_entry' previously created
data_entry.open()

The pulse file is now open, but no data has been fetched from it yet.

2.3.1.3. put/putSlice

IDSs are data containers described by the IMAS Data Dictionary. An IDS represents either a diagnostic (like the 'bolometer' IDS), a system (like the 'camera_ir' IDS), or a concept (like the 'equilibrium' IDS, which represents the plasma equilibrium).

To write IDS data to the pulse file, we will first use the put() operation, which writes all static (non-time-dependent) and dynamic data of an IDS.

Let's add a 'magnetics' IDS to the pulse file previously created.

The first part of the code below opens a Data Entry (see 2.3.1.2.); a 'magnetics' IDS is then created and written to the Data Entry using the put() operation:

import imas
import getpass
import numpy as np
from imas import imasdef
#creates the Data Entry object 'data_entry' associated with the pulse file with shot=15000, run=1, belonging to database 'data_access_tutorial' of the current user, using the MDS+ backend
data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())
#opens the pulse file associated with the Data Entry object 'data_entry' previously created
data_entry.open()

magnetics_ids = imas.magnetics() #creating a 'magnetics' IDS
magnetics_ids.ids_properties.homogeneous_time=1 #setting homogeneous_time to 1 (homogeneous time base)
magnetics_ids.ids_properties.comment='IDS created for testing the IMAS Data Access layer'
magnetics_ids.time=np.array([0.]) #the time vector must not be empty, otherwise an error will occur at runtime
data_entry.put(magnetics_ids, 0) #writing magnetics data to the data_entry associated with the pulse file. The second argument 0 is the so-called IDS occurrence.
data_entry.close()

2.3.1.4. get/getSlice

2.3.1.5. delete_data

2.3.1.6. close

2.4. Accessing data from the command line (Bartek Palak)

2.4.1. Listing pulse files

imasdbs command

Usage: imasdbs [OPTIONS] [COMMAND]

This program lists existing databases.

Possible commands are:

list <shot number> - list existing databases

slices <shot number> <run number> - list existing databases, including number of timeslices and time range for time-dependent IDSes

times <shot number> <run number> - list existing databases, including number of timeslices their time points for time-dependent IDSes

tokamak - list existing tokamaks (with data versions)

dataversion - list existing dataversions (with tokamaks)

If the optional arguments shot number and run number are given, only databases with these numbers will be shown.

If no command is given, the list command is performed.

To see databases stored in the public database, use 'public' as the user name.

Options:

-h, --help show this help message and exit

-u USER, --user=USER Show databases of specified user

-t TOKAMAK, --tokamak=TOKAMAK Show only databases for specified tokamaks

-v VERSION, --version=VERSION Show only databases for specified data version

--backend=BACKEND Show databases written with given backend(s). Comma-separated list of backends (Currently supported: mdsplus, hdf5). By default all backends are shown.

-c, --compact Compact/reduced output

shell> imasdbs -t test slices 9999 2
Tokamak: test
   Data version: 3
      UAL Backend: mdsplus
         Shot    10
             Run:     40
                 core_profiles:   25 slices (345.0 - 345.48)
                  core_sources:   25 slices (345.0 - 345.48)
                core_transport:   25 slices (345.0 - 345.48)
                   equilibrium:   25 slices (345.0 - 345.48)
               transport_solver_numerics:   25 slices (345.0 - 345.48)
                          wall:   25 slices (345.0 - 345.48)

shell> imasdbs -u palakb
Tokamak: test
   Data version: 3
      UAL Backend: mdsplus
         Shot     1 Runs:     1
         Shot     2 Runs:     3   666   777   999
         Shot    10 Runs:    30    40    42    60    61    64    65    66    80    81    99   123   666   999  1234
         Shot    12 Runs:     1     2    99
         Shot    13 Runs:     1

2.4.2. Dumping pulse files

To list the content (all data) of an IDS, use the idsdump script:

shell> idsdump
Usage: idsdump <USER> <TOKAMAK> <VERSION> <SHOT> <RUN> <IDS>

shell> idsdump $USER test 3 9999 2 equilibrium
class equilibrium
Attribute ids_properties
        class ids_properties
        Attribute comment:
        Attribute homogeneous_time: 1
        Attribute source:
        Attribute provider:
        Attribute creation_date:
[.......]
Attribute code
        class code
        Attribute name: 12 34 56 78 90
        Attribute commit: 12 34 56 78 90
        Attribute version: 12 34 56 78 90
        Attribute repository: 12 34 56 78 90
        Attribute parameters: 12 34 56 78 90
        Attribute output_flag
        [-819925519  678927020  358961885  263985221 -518535735 -656888240
          885898039 -949201251  187087431  189678740  306846126  536940120
         -842545485 -121858537 -867824798  103609281 -986039164 -761981263
         -444948662 -178414734   91809633  -65221224  575637439 -526052305]
Attribute time
[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17. 18.
19. 20. 21. 22. 23. 24.]

2.4.3. Dumping an IDS node

Reading only a subset of an IDS, i.e. a single node (and its descendants if the node is a structure), makes the GET operation much faster. To retrieve only the requested node, use the idsdumppath script.

shell> idsdumppath
Usage: idsdumppath <USER> <TOKAMAK> <VERSION> <SHOT> <RUN> <IDS> <DATA_PATH>

Path syntax:

The segments of the path to the requested node(s) are separated by slashes (“/path/to/node(s)”).
Nodes representing arrays must carry either a single index (“/path/to/array(idx)/field”) or a “Fortran style” index range (“path/to/array(x:y)/field”).
Limitation: for nested arrays, index ranges cannot be specified for AoS (array of structures) ancestors; only single index values are handled there (e.g. “field/with/ancestorAoS(x:y)/field/AoS(n:m)” is not supported).

Data query examples:

“flux_loop(1)/flux/data(1:5)”
“bpol_probe(2:3)/field/data”
“loop(:)/current”
“time(4:-1)”
“profiles_1d(2)/grid/rho_tor_norm(2:4)”
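
To make the path syntax concrete, the sketch below splits a data path into its segments and extracts the index or index range of each one. It is only an illustration of the syntax described above and not the parser used by idsdumppath. The names PathSegment and parse_data_path are hypothetical, and the nested-AoS limitation is not enforced.

```python
import re
from typing import List, NamedTuple, Optional


class PathSegment(NamedTuple):
    name: str             # node name, e.g. 'flux_loop'
    start: Optional[int]  # first index, None if omitted
    stop: Optional[int]   # last index, None if omitted
    is_range: bool        # True for 'x:y' style indices, including '(:)'


# A node name optionally followed by '(...)' holding an index or a range
SEGMENT = re.compile(r"^(?P<name>\w+)(?:\((?P<index>[^()]*)\))?$")


def parse_data_path(path: str) -> List[PathSegment]:
    """Split 'a(1)/b/c(2:4)' into PathSegment tuples; indices are kept as written."""
    segments = []
    for part in path.strip("/").split("/"):
        match = SEGMENT.match(part)
        if match is None:
            raise ValueError(f"malformed path segment: {part!r}")
        name, index = match.group("name"), match.group("index")
        if index is None:
            segments.append(PathSegment(name, None, None, False))
        elif ":" in index:
            low, _, high = index.partition(":")
            segments.append(PathSegment(
                name,
                int(low) if low.strip() else None,
                int(high) if high.strip() else None,
                True,
            ))
        else:
            value = int(index)
            segments.append(PathSegment(name, value, value, False))
    return segments


for example in ("flux_loop(1)/flux/data(1:5)", "loop(:)/current", "time(4:-1)"):
    print(example, "->", parse_data_path(example))
```

The indices are returned exactly as written in the path; how they map onto array positions is left to the tool.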

shell> idsdumppath $USER test 3 9999 2 equilibrium "code"
Type: <class 'imas_3_24_0_ual_4_2_0.equilibrium.code__structure'>
----------------------------------------------
----------------------------------------------
class code
Attribute name: 12 34 56 78 90
Attribute commit: 12 34 56 78 90
Attribute version: 12 34 56 78 90
Attribute repository: 12 34 56 78 90
Attribute parameters: 12 34 56 78 90
Attribute output_flag
[-819925519  678927020  358961885  263985221 -518535735 -656888240
  885898039 -949201251  187087431  189678740  306846126  536940120
-842545485 -121858537 -867824798  103609281 -986039164 -761981263
-444948662 -178414734   91809633  -65221224  575637439 -526052305]

shell> idsdumppath $USER test 3 9999 2 equilibrium "code/output_flag(0)"
Type: <class 'numpy.int32'>
----------------------------------------------
----------------------------------------------
-819925519

2.4.4. Copying database files directly

If you know the user name, machine name, shot number and run number, you can import another user's database files by copying them directly from that user's public directory. Database files are located in:

~$USERNAME/public/imasdb/$TOKAMAKNAME/$DATAVERSION/0/ids_SSSSRRRR.*

The example below copies data from user michalo, machine test (shot 12, run 2, and shot 13, run 3):

# change to the target directory under your $HOME
cd $HOME/public/imasdb/test/3/0/

# copy the data files (note the trailing dot at the end of each command: it is the destination, i.e. the current directory)
cp ~michalo/public/imasdb/test/3/0/ids_120002.* .
cp ~michalo/public/imasdb/test/3/0/ids_130003.* .
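
The same copy can be scripted in Python. The sketch below assumes the ids_SSSSRRRR.* naming shown above with run numbers 0 to 9999; the function name copy_pulse_files is hypothetical, and the demonstration uses temporary directories with empty placeholder files instead of real user directories.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_pulse_files(src_dir: Path, dst_dir: Path, shot: int, run: int) -> List[Path]:
    """Copy every file of one pulse (ids_<shot><run as 4 digits>.*) into dst_dir."""
    stem = f"ids_{shot}{run:04d}"
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(src_dir.glob(f"{stem}.*")):
        # copy2 also preserves timestamps and permissions where possible
        copied.append(Path(shutil.copy2(source, dst_dir / source.name)))
    return copied


# Demonstration on temporary directories with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "other_user" / "public" / "imasdb" / "test" / "3" / "0"
    target_dir = Path(tmp) / "me" / "public" / "imasdb" / "test" / "3" / "0"
    source_dir.mkdir(parents=True)
    for extension in ("characteristics", "datafile", "tree"):
        (source_dir / f"ids_120002.{extension}").touch()
    for copied_file in copy_pulse_files(source_dir, target_dir, shot=12, run=2):
        print(copied_file.name)
```

For a real copy, src_dir would be something like Path('~michalo/public/imasdb/test/3/0').expanduser() and dst_dir the matching directory under your own $HOME.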

3. Adapting user code into IMAS - 22.09

3.1. Motivations and different levels of adaptation (Bartek Palak)

3.2. Code adaptation (Dimitriy)

3.3. Wrapping user codes into actors - iWrap (Bartek Palak)

>>> iWRAP DESCRIPTION<<<

3.3.1. motivations

3.3.2. how to prepare user code

3.3.3. wrapping (job description, iWrap)

3.3.4. usage of actor within WF

4. Dealing with experimental data (Michal P.) - 22.09

5. Adapting codes to IMAS based Docker (Tomek) - 22.09

5.1. Introduction

Docker is a tool for running containers, i.e. lightweight, isolated environments (OS, libraries, configuration)
- Docker can be installed on Mac, Windows or Linux (see the Docker documentation)
To work with a container you need an image to start from
Images can be found in public or private repositories:
- Base operating systems e.g. Ubuntu or CentOS
- Ready-to-start interpreters e.g. Python or PHP
- Database management systems e.g. MySQL or PostgreSQL
A container executes one or more processes in its isolated environment
The process might be a daemon e.g. Apache HTTP Server or it can be an interactive terminal
Dockerization features:
- Quick prototyping and testing
  - For example, you can easily spawn multiple versions of PostgreSQL and test your SQL queries against them
- Better dissemination
  - The product owner can share a Docker image and anyone interested can use it straight away
- Enhanced security
  - A container is isolated and runs a limited number of processes
  - Even if it gets hacked, the rest of the system remains unharmed
- Easier maintenance
  - The images are usually built in an automatic way via CI/CD pipelines or regularly scheduled jobs
  - No matter how complex the environment is, once the image recipe is created all interested users can instantiate containers at will
