1. IMAS Primer
1.1. What is IMAS?
1.2. IMAS Data Model & IDS (Frederic) - 20.09
1.2.1. IDS and time: homogeneous, heterogeneous, independent
1.2.2. occurrences
1.2.3. slices
1.3. Database entries
1.3.1. MDSPlus pulse files
2. IMAS Access Layer - 20.09
2.1. The goals of Access Layer
The Access Layer (or AL) is the central data access library, which gives users and applications access to data through various APIs and programming languages.
Its main purpose is to provide mechanisms for reading, writing and manipulating IDS data objects as defined in the Data Dictionary (DD).
2.2. Access Layer architecture (Bartek)
To support multiple languages while maintaining a single structure definition, the AL architecture is split into several layers.
2.2.1. Application Layer
The Application Layer is the layer of user programs and dedicated tools that manipulate IDS data through the High Level Interfaces.
2.2.2. High Level Interfaces
This layer provides the external Application Programming Interface (API), and its code is automatically produced from the XML description of the ITM database structure. For each supported programming language, a high level layer is generated in the target language.
High Level Interfaces available in AL include:
- Fortran
- C++
- Matlab
- Java
- Python
Methods exposed by High Level Interfaces:
- Operations on a database entry:
  - CREATE
  - OPEN
  - DELETE
  - CLOSE
- Operations on IDSs: the AL operates at the IDS level (with some exceptions), providing only methods for “atomic” operations such as:
  - PUT
  - GET
  - PUT_SLICE
  - GET_SLICE
2.2.3. Low Level
The Low Level layer is implemented in C++ (with a C API) and provides unstructured data access to the underlying databases/backends. It defines an API which is used by all the high level layer implementations. End users do not need to know this API (presented in a later section); it is required only by developers of new language-specific high level implementations of the AL and by developers of support tools.
2.2.4. Backends
Backends are plug-ins that allow the abstract Low Level layer to interact with physical storage.
The currently implemented backends can store data in a memory cache, in MDSPlus files, in HDF5 files and in ASCII files (the ASCII backend is used mainly for testing purposes).
2.3. High Level Interfaces and their API (Application Programming Interface)
There are currently 5 High Level Interfaces (HLIs) available from the following programming languages:
- Fortran
- C++
- Java
- Python
- Matlab
Only Python and Matlab provide an interactive user session for accessing IMAS data.
The HLI API covers all available Access Layer features:
- creating a so-called new IMAS Data Entry
- opening an existing IMAS Data Entry
- writing data from an IDS to a Data Entry
- reading data of an IDS from an existing Data Entry
- deleting an IDS from an existing Data Entry
- closing a Data Entry
A Data Entry is an IMAS concept for designating a pulse with given shot and run numbers located in some database (see below).
2.3.1. HLI API (Ludovic)
As an example, we will describe the Python HLI.
Documentation of all other HLIs can be found in the User guide, available from this page: https://confluence.iter.org/display/IMP/Integrated+Modelling+Home+Page
2.3.1.1. create
Creating a new Data Entry using the MDS+ backend consists of creating a new pulse file on disk. Therefore, you need write permissions for the database specified in the create() command.
So, let's first create a new database belonging to the current user.
From a new shell, execute the following commands:
    module load IMAS
    imasdb data_access_tutorial
Now, the following code will create a new MDS+ pulse file for shot=15000, run=1 in the 'data_access_tutorial' database of the current user:
    import getpass

    import imas
    from imas import imasdef

    # Create the Data Entry object 'data_entry' associated with the pulse file for
    # shot=15000, run=1 in the 'data_access_tutorial' database of the current user,
    # using the MDS+ backend
    data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())

    # Create the pulse file associated with the 'data_entry' object
    data_entry.create()

    # Close the pulse file associated with the 'data_entry' object
    data_entry.close()
Execution of the code above will create the pulse file at location ~/public/imasdb/data_access_tutorial/3/0:
    $ ls -alh ~/public/imasdb/data_access_tutorial/3/0
    total 78M
    drwxrwsr-x  2 fleuryl fleuryl 4.0K Aug 31 10:09 .
    drwxrwsr-x 12 fleuryl fleuryl 4.0K Aug 31 10:09 ..
    -rw-rw-r--  1 fleuryl fleuryl  42M Aug 31 10:09 ids_150000001.characteristics
    -rw-rw-r--  1 fleuryl fleuryl   37 Aug 31 10:09 ids_150000001.datafile
    -rw-rw-r--  1 fleuryl fleuryl  36M Aug 31 10:09 ids_150000001.tree
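The listing above suggests a naming pattern for MDS+ pulse files. The sketch below reproduces it under two assumptions inferred only from that listing, not from the Access Layer sources: the files sit under `public/imasdb/<database>/3/0` in the user's home directory, and the file stem is `ids_` followed by the shot number and the run number zero-padded to four digits. The function name `mdsplus_pulse_files` is hypothetical.

```python
import os
from pathlib import Path


def mdsplus_pulse_files(database, shot, run, user_home="~"):
    """Return the expected MDS+ pulse file paths for one Data Entry.

    Assumptions (inferred from the directory listing, not from the AL code):
    - files live under <home>/public/imasdb/<database>/3/0
    - the stem is 'ids_' + shot + run zero-padded to 4 digits
    """
    home = Path(os.path.expanduser(user_home))
    directory = home / "public" / "imasdb" / database / "3" / "0"
    stem = f"ids_{shot}{run:04d}"
    extensions = ("characteristics", "datafile", "tree")
    return [directory / f"{stem}.{ext}" for ext in extensions]


# Print the three file names expected for shot=15000, run=1
for path in mdsplus_pulse_files("data_access_tutorial", 15000, 1):
    print(path.name)
```

Comparing these names with the actual content of the directory can be a quick check that create() wrote the pulse file where expected.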
2.3.1.2. open
The following code opens the existing MDS+ pulse file created previously for shot=15000, run=1, from the 'data_access_tutorial' database of the current user:
    import getpass

    import imas
    from imas import imasdef

    # Create the Data Entry object 'data_entry' associated with the pulse file for
    # shot=15000, run=1 in the 'data_access_tutorial' database of the current user,
    # using the MDS+ backend
    data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())

    # Open the pulse file associated with the 'data_entry' object
    data_entry.open()
The pulse file is now open; however, no data have been fetched from it yet.
2.3.1.3. put/putSlice
IDSs are data containers described by the IMAS Data Dictionary. An IDS represents either a diagnostic (like the 'bolometer' IDS), a system (like the 'camera_ir' IDS), or a concept (like the 'equilibrium' IDS, which represents the plasma equilibrium).
In order to write IDS data to the pulse file, we will first use the put() operation which writes all static (non time dependent) and dynamic data from an IDS.
Let's add a 'magnetics' IDS to the pulse file previously created.
The first part of the code below opens a data_entry (see 2.3.1.2.); a magnetics IDS is then created and written to the data_entry using the put() operation:
    import getpass

    import imas
    import numpy as np
    from imas import imasdef

    # Create the Data Entry object 'data_entry' associated with the pulse file for
    # shot=15000, run=1 in the 'data_access_tutorial' database of the current user,
    # using the MDS+ backend
    data_entry = imas.DBEntry(imasdef.MDSPLUS_BACKEND, 'data_access_tutorial', 15000, 1, user_name=getpass.getuser())

    # Open the pulse file associated with the 'data_entry' object
    data_entry.open()

    # Create a 'magnetics' IDS
    magnetics_ids = imas.magnetics()
    # Set the homogeneous time flag to 1
    magnetics_ids.ids_properties.homogeneous_time = 1
    magnetics_ids.ids_properties.comment = 'IDS created for testing the IMAS Data Access layer'
    # The time (vector) basis must not be empty, otherwise an error occurs at runtime
    magnetics_ids.time = np.array([0.0])

    # Write the magnetics data to the pulse file associated with 'data_entry'.
    # The second argument (0) is the so-called IDS occurrence.
    data_entry.put(magnetics_ids, 0)

    data_entry.close()
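The examples above call close() explicitly, so the Data Entry stays open if an exception is raised before that call. One possible way to guard against this is a small context manager. This is only a sketch: `opened` is a hypothetical helper, not part of the imas module, and it relies solely on the open() and close() methods already used above.

```python
from contextlib import contextmanager


@contextmanager
def opened(data_entry):
    """Open a Data Entry and make sure close() is called afterwards.

    Hypothetical helper: it only assumes that 'data_entry' provides the
    open() and close() methods used in the examples of this section.
    """
    data_entry.open()
    try:
        yield data_entry
    finally:
        # Runs on normal exit and when the body raises an exception
        data_entry.close()
```

With such a helper, the put() call could be written as `with opened(data_entry) as entry: entry.put(magnetics_ids, 0)`, and the pulse file would be closed even if put() fails.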
2.3.1.4. get/getSlice
2.3.1.5. delete_data
2.3.1.6. close
2.4. Accessing data from the command line (Bartek Palak)
2.4.1. Listing pulse files
imasdbs command
Usage: imasdbs [OPTIONS] [COMMAND]
This program lists existing databases.
Possible commands are:
list <shot number> - list existing databases
slices <shot number> <run number> - list existing databases, including number of timeslices and time range for time-dependent IDSes
times <shot number> <run number> - list existing databases, including number of timeslices their time points for time-dependent IDSes
tokamak - list existing tokamaks (with data versions)
dataversion - list existing dataversions (with tokamaks)
If the optional arguments shot number and run number are given, only databases with these numbers will be shown.
If no command is given, the list command is performed.
To see databases stored in the public database, use 'public' as the user name.
Options:
  -h, --help            show this help message and exit
  -u USER, --user=USER  Show databases of specified user
  -t TOKAMAK, --tokamak=TOKAMAK
                        Show only databases for specified tokamaks
  -v VERSION, --version=VERSION
                        Show only databases for specified data version
  --backend=BACKEND     Show databases written with given backend(s).
                        Comma-separated list of backends (Currently supported:
                        mdsplus, hdf5). By default all backends are shown.
  -c, --compact         Compact/reduced output
2.4.2. Dumping pulse files
To list the content (all data) of an IDS, use the idsdump script.
2.4.3. Dumping an IDS node
Getting a subset of an IDS means reading only one node (and its descendants if the node is a structure), which makes the GET operation much faster. To retrieve only the requested node, call the idsdumppath script.
Path syntax:
- The elements of the path to the requested node(s) are separated by slashes (“/path/to/node(s)”).
- Nodes representing arrays must carry either a single index (“/path/to/array(idx)/field”) or a “Fortran style” index range (“/path/to/array(x:y)/field”).
- Limitation: for nested arrays, a set of indices cannot be specified for AoS (array of structures) ancestors; only single index values are handled for them (e.g. “field/with/ancestorAoS(x:y)/field/AoS(n:m)” is not supported).
Data query examples:
- “flux_loop(1)/flux/data(1:5)”
- “bpol_probe(2:3)/field/data”
- “loop(:)/current”
- “time(4:-1)”
- “profiles_1d(2)/grid/rho_tor_norm(2:4)”
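To make the path syntax above more concrete, here is a minimal sketch that splits such a query into node names and their optional indices. It is an illustration only and not the parser used by idsdumppath; the function name `parse_ids_path` is hypothetical, and index expressions are kept as raw strings rather than interpreted.

```python
import re

# One path segment: a node name, optionally followed by "(index)"
_SEGMENT = re.compile(r"^(?P<name>\w+)(?:\((?P<index>[^()]*)\))?$")


def parse_ids_path(path):
    """Split an IDS node path into (name, index) pairs.

    The index is the raw text found between parentheses (for example
    "1", "2:3" or ":"), or None when the segment carries no index.
    """
    segments = []
    for part in path.strip("/").split("/"):
        match = _SEGMENT.match(part.strip())
        if match is None:
            raise ValueError(f"malformed path segment: {part!r}")
        segments.append((match.group("name"), match.group("index")))
    return segments


print(parse_ids_path("flux_loop(1)/flux/data(1:5)"))
# prints [('flux_loop', '1'), ('flux', None), ('data', '1:5')]
```

Keeping the index as text avoids committing to any interpretation of ranges such as “4:-1”, whose exact meaning is defined by the tool itself.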
2.4.4. Copying database files directly
If you know the user name, machine name, shot number and run number, you can import another user's database files by copying them directly from that user's public directory. Database files are located inside:
Take a look at the example below. We will copy data from user michalo, machine test, shot: 12 and run: 2
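As a rough illustration of such a direct copy, the sketch below copies all files of one pulse from a source directory to a target directory. It assumes the file naming seen in the listing of section 2.3.1.1 (`ids_` followed by the shot number and the run number zero-padded to four digits); the function name `copy_pulse_files` is hypothetical, and the source and target directories must be supplied by the caller.

```python
import shutil
from pathlib import Path


def copy_pulse_files(source_dir, target_dir, shot, run):
    """Copy every file of one pulse from source_dir to target_dir.

    Assumption: pulse files are named 'ids_<shot><run padded to 4 digits>.*',
    as in the directory listing of section 2.3.1.1.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(source_dir.glob(f"ids_{shot}{run:04d}.*")):
        # copy2 also preserves timestamps and permission bits
        copied.append(Path(shutil.copy2(source, target_dir / source.name)))
    return copied
```

For the example above, the call would be `copy_pulse_files(source, target, 12, 2)`, with `source` pointing to the database directory of user michalo for machine test and `target` to the corresponding directory of the current user.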
3. Adapting user code into IMAS - 22.09
3.1. Motivations and different levels of adaptation (Bartek Palak)
3.2. Code adaptation (Dimitriy)
3.3. Wrapping user codes into actors - iWrap (Bartek Palak)
3.3.1. motivations
3.3.2. how to prepare user code
3.3.3. wrapping (job description, iWrap)
3.3.4. usage of actor within WF
4. Dealing with experimental data (Michal P.) - 22.09
5. Adapting codes to IMAS based Docker (Tomek) - 22.09
5.1. Introduction
Docker is a tool for starting containers, i.e. lightweight, isolated environments (OS, libraries, configurations).
- You can install Docker for Mac, Windows or Linux: documentation
To work with a container you need an image to start from
Images can be found in public or private repositories:
A container executes one or more processes in its isolated environment
The process might be a daemon e.g. Apache HTTP Server or it can be an interactive terminal
Dockerization features:
Quick prototyping and testing
- For example, you can easily spawn multiple versions of PostgreSQL and test your SQL queries against them
Better dissemination
- The product owner can share a Docker image and anyone interested can use it straight away
Enhanced security
- A container is isolated and runs a limited number of processes
- Even if it gets hacked, the rest of the system remains unharmed
Easier maintenance
- The images are usually built in an automatic way via CI/CD pipelines or regularly scheduled jobs
- No matter how complex the environment is, once the image recipe is created all interested users can instantiate containers at will