Credits

Tutorial material published on this page was initially prepared by:

Dr Frédéric IMBEAUX
Head of the Fusion Plasma Physics department

Coordinator of the EUROfusion Core Programming Team
CEA/DRF/Institut de Recherche sur la Fusion par confinement Magnétique





IMAS – introduction to basic concepts

...

  • Aims at being the main gate to data for scientific exploitation, both for code interfacing and hands-on data browsing
  • is unique for simulated and experimental data (same data structures)
  • is device-generic → usable for ITER or any other fusion device
  • has precise design rules for global homogeneity
  • has a precise lifecycle procedure, so that it can evolve and be jointly developed by multiple teams

Access Layer

  • API providing access methods (read/write) to an ITER physics Database based on the ITER Physics Data Model
  • Provided in Fortran, C++, Matlab, Java, Python
  • The only effort for using the Data Model is to map the input/output of your code to the Data Model and add some GET/PUT commands
  • The access methods write to a local database stored in your account
  • These local databases can be shared among users (for reading only) and can be accessed remotely

First use case: User or code accessing Data Base through Access Layer


Interface Data Structures (IDS) to couple codes

  • For Integrated Modelling, the Data Model also defines Interface Data Structures (IDS). These are structures within the Data Model that are used as standard interfaces between codes
  • Solves the N² problem (large number of components from various ITER members expected in IMAS)
  • The usage of the IDSs makes the coupling of codes straightforward if they are in the same programming language
  • The usage of the IDSs + AL allows coupling of codes even if they are not written in the same language
  • The usage of the IDSs does NOT constrain your choice of coupling method. Codes can be coupled as:
    • Subroutines within a main program
    • Executables within a script
    • Components within a workflow engine
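
The N² argument above can be made concrete with a small count. This is a minimal sketch, assuming that without a standard interface every ordered pair of codes would need its own dedicated converter; the function names are illustrative and not part of IMAS.

```python
def pairwise_converters(n_codes: int) -> int:
    """Converters needed when every code exchanges data directly with every other code.

    Assumption: one converter per ordered pair (A -> B and B -> A are distinct).
    """
    return n_codes * (n_codes - 1)


def standard_interface_adapters(n_codes: int) -> int:
    """Adapters needed when each code only maps its own input/output to the shared IDSs."""
    return n_codes


# Compare both approaches for a few hypothetical numbers of codes.
for n in (3, 10, 50):
    print(n, pairwise_converters(n), standard_interface_adapters(n))
```

The first count grows quadratically with the number of codes, while the second grows linearly, which is the motivation for a single shared set of interface structures.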

Second use case: codes coupled together directly (same language) or through AL (different languages)

Workflow Engine and Component Generator to facilitate the development of Integrated Modelling workflows

...

Physics codes + Data Access wrapped into a workflow component

Workflow components coupled and executed within a workflow engine

The layered structure: from the physics solver to the launcher

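[Figure: the layered structure, with the Physics solver shown in dark green and the generated workflow component in pink]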

  • The structure is layered so that functionalities are clearly separated
  • It is generic and independent of e.g. the launching script/workflow engine
  • The Physics solver part (dark green) is not changed and is not linked to the ITER Data Model. It may or may not use the ITER Data Model internally.
  • The architecture is identical to the case of a component called
    • within a workflow engine
  • The Physics Subroutine can be directly reused to generate a workflow component: the IMAS Infrastructure provides a tool that generates the component (pink part) automatically
  • Exception: for codes handling massive amounts of data, Data Access is usually parallelised and must be done inside the physics_solver (no processor has enough memory to gather all data)

...

Data Model: Interface Data Structure

...

  • List of all IDSs. For each of them, detailed documentation is provided:
    • Full path name: name of all variables of the IDS, with their path in the structure. Replace “/” by the structure operator of your programming language, e.g. “%” in Fortran, “.” in C++, Matlab, Java, Python
    • Description
    • Definition
    • Units in []
    • In {}, whether it is STATIC (constant over a range of pulses, e.g. machine configuration), CONSTANT (constant over the pulse or the simulation), or DYNAMIC (time-dependent within the pulse or the simulation)
    • Data_Type: indicates whether it is a string, an integer or a real, and its dimension (0D, 1D, 2D, …)
    • Coordinates: for each dimension, the full path name to the related coordinate. If the dimension simply refers to a quantity not present in the Data Model, it is indicated as “1…N”
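
The path-to-syntax rule described above can be sketched as a small helper. This is only an illustration: the function name `to_language_syntax` and the dictionary are hypothetical and not part of the Access Layer; the operators themselves are the ones listed in the documentation notes.

```python
# Structure operator per language, as listed in the notes above.
STRUCTURE_OPERATOR = {
    "fortran": "%",
    "c++": ".",
    "matlab": ".",
    "java": ".",
    "python": ".",
}


def to_language_syntax(ids_name: str, path: str, language: str) -> str:
    """Turn a Data Model path (nodes separated by '/') into a variable reference.

    Only the '/' separator is replaced; array index notation is left untouched.
    """
    operator = STRUCTURE_OPERATOR[language.lower()]
    return operator.join([ids_name, *path.split("/")])


print(to_language_syntax("equilibrium", "time_slice(:)/profiles_1d/phi", "Fortran"))
```

Note that the helper leaves the index notation “(:)” as written in the Data Model documentation; each language has its own indexing syntax, which has to be adapted by hand.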

Exercise: use the Data Model documentation

  • Go to the DM documentation and answer the following questions:
  • How many IDSs have been defined?
    • Answer: 46
  • Where can I find the toroidal flux profile calculated by my equilibrium code?
    • Answer: In the equilibrium IDS, search for “toroidal flux”, found at path time_slice(:)/profiles_1d/phi
  • What are its units?
    • Answer: Wb
  • Does it vary during the pulse?
    • Answer: Yes, it is dynamic
  • How many dimensions does it have?
    • Answer: 1D (float)
  • What are its axes?
    • Answer: time_slice(:)/profiles_1d/psi
  • Assuming I have retrieved a full equilibrium structure in my Fortran program, what syntax would I use for this variable?
    • Answer: equilibrium%time_slice(:)%profiles_1d%phi

Arrays of structures

  • Arrays of structures (AoS) are used when a list of objects has nodes of different sizes, in order to avoid creating large sparse arrays
  • Two kinds of arrays of structures are distinguished:
    • Case 1: The structure contains asynchronous nodes, e.g. PF coils may be acquired with different timebases. For example, pf_active/coil is a vector, in Fortran: pf_active%coil(i1). For each coil, the current is a “data+time” structure, i.e. each coil current has its own timebase:
      • pf_active%coil(i1)%current%data(itime)
      • pf_active%coil(i1)%current%time(itime)
    • These Case 1 AoS are used mainly in IDSs representing tokamak subsystems
    • Case 2: The coordinate of the array of structures is a timebase. An index of the array of structures represents a time slice. As a consequence, the structure contains only dynamic and synchronous nodes, e.g. equilibrium/time_slice(itime). This time slice representation allows the size of the children to vary as a function of time (e.g. variable grid size).
    • These Case 2 AoS are used mainly in IDSs representing abstract physical quantities.
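
The two cases can be illustrated with plain Python dataclasses. This is a simplified sketch and not the actual IDS classes: the class names, the reduced set of fields and all numerical values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Signal:
    """A 'data+time' structure: values and the timebase they were acquired on."""
    data: List[float] = field(default_factory=list)
    time: List[float] = field(default_factory=list)


@dataclass
class Coil:
    current: Signal = field(default_factory=Signal)


@dataclass
class Profiles1D:
    psi: List[float] = field(default_factory=list)
    phi: List[float] = field(default_factory=list)


@dataclass
class TimeSlice:
    time: float = 0.0
    profiles_1d: Profiles1D = field(default_factory=Profiles1D)


# Case 1: one element per coil, each with its own timebase (lengths may differ).
coils = [
    Coil(Signal(data=[1.0, 1.1, 1.2], time=[0.0, 0.5, 1.0])),
    Coil(Signal(data=[2.0, 2.1], time=[0.0, 1.0])),
]

# Case 2: one element per time slice; the grid size may change from slice to slice.
time_slices = [
    TimeSlice(time=0.0, profiles_1d=Profiles1D(psi=[0.0, 0.5, 1.0], phi=[0.0, 0.2, 0.4])),
    TimeSlice(time=1.0, profiles_1d=Profiles1D(psi=[0.0, 0.25, 0.5, 1.0], phi=[0.0, 0.1, 0.2, 0.4])),
]

print([len(c.current.time) for c in coils])
print([len(ts.profiles_1d.phi) for ts in time_slices])
```

In both cases each element carries arrays of its own length, so no padding to a common size is needed.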

An IDS can be used in two different ways: with a homogeneous timebase or not

...

  • A Data Entry is a collection of potentially all IDSs
  • Multiple occurrences of a given IDS can co-exist, e.g. multiple equilibria calculated by different codes / assumptions
  • A Data Entry is defined by:
    • IMAS version
    • User name
    • Machine name
    • Pulse number
    • Run number
  • The recommended usage for a Simulation is that:
    • The simulation starts by reading data from an Input Data Entry (which can be that of another User)
    • During the simulation, intermediate results are stored in a temporary “work” Entry (another Run number)
    • During or at the end of the run, the results intended to be archived are written to an Output Data Entry (Run number)
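
The five values that define a Data Entry, and the recommended input/work/output pattern, can be sketched as follows. The `DataEntry` class is a hypothetical illustration, not the Access Layer API; the user names, pulse and run numbers are made up, and it is an assumption here that the work and output entries share everything except the Run number.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataEntry:
    """The five values that identify a Data Entry, as listed above."""
    imas_version: str
    user: str
    machine: str
    pulse: int
    run: int


# Hypothetical values, for illustration only.
input_entry = DataEntry(imas_version="3", user="another_user", machine="test", pulse=22, run=1)

# Assumption: the work and output entries differ from each other only by Run number.
work_entry = DataEntry(imas_version="3", user="me", machine="test", pulse=22, run=2)
output_entry = replace(work_entry, run=3)

print(input_entry)
print(work_entry)
print(output_entry)
```

Keeping the identifier immutable (frozen) makes it safe to use as a dictionary key, e.g. to keep track of which entries a simulation has read from or written to.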

Where to find your local IMAS data repository?

  • Answer #1: you do not care because the Access Layer will know where to find it
  • Answer #2: you care if you want to test that your program has indeed written something
  • ls -gtr ~/public/imasdb/test/3/mdsplus/0
    • test - machine name
    • 3 - IMAS version (major)
    • 0 - Additional folder level to store RUN numbers beyond 10000

  • The file names are:
    • ids_PulseRun.tree
    • ids_PulseRun.datafile
    • ids_PulseRun.characteristics
  • Where Pulse is the pulse number and Run is the 4 rightmost digits of the run number of the Data Entry.
  • Example: PULSE 22, RUN 2 consists of 3 files:
    • ids_220002.tree
    • ids_220002.datafile
    • ids_220002.characteristics
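
The naming rule above can be checked with a short helper. This is a sketch with hypothetical function names; in particular, deriving the last folder level as the run number divided by 10000 is an assumption based on the description of that folder, not something stated explicitly in this tutorial.

```python
from pathlib import Path
from typing import List


def data_entry_files(pulse: int, run: int) -> List[str]:
    """File names of a Data Entry: 'ids_' + pulse + the 4 rightmost digits of the run."""
    stem = f"ids_{pulse}{run % 10000:04d}"
    return [f"{stem}.{ext}" for ext in ("tree", "datafile", "characteristics")]


def data_entry_dir(machine: str, imas_major: int, run: int) -> Path:
    """Directory following the layout shown above.

    Assumption: the last folder level is run // 10000 (0 for runs below 10000).
    The leading '~' is kept unexpanded, as in the 'ls' command above.
    """
    return Path("~") / "public" / "imasdb" / machine / str(imas_major) / "mdsplus" / str(run // 10000)


print(data_entry_files(22, 2))
print(data_entry_dir("test", 3, 2))
```

For PULSE 22, RUN 2 the helper reproduces the three file names of the example above.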

Data Dictionary Lifecycle

...