1. IMAS – introduction to basic concepts

1.1. Key IMAS element: the Data Model

  • Joint scientific exploitation of ITER by multiple teams requires a Data Model to name and communicate physical and technical information → ITER Physics Data Model
  • The Data Model provides information for data providers and data consumers on…
    • What data exist?
    • What are they called?
    • How are they structured, as seen by the user?
  • IMAS components use the ITER Physics Data Model to:
    • Read & Write simulation results or experimental data
    • Interface codes together

1.2. The ITER Physics Data Model

  • Aims to be the main gateway to data for scientific exploitation, both for code interfacing and for hands-on data browsing
  • Is the same for simulated and experimental data (same data structures)
  • Is device-generic → usable for ITER or any other fusion device
  • Has precise design rules for global homogeneity
  • Has a precise lifecycle procedure so that it can evolve and be jointly developed by multiple teams

1.3. Access Layer

  • API providing access methods (read/write) to an ITER physics Database based on the ITER Physics Data Model
  • Provided in Fortran, C++, Matlab, Java, Python
  • The only effort for using the Data Model is to map the input/output of your code to the Data Model and add some GET/PUT commands
  • The access methods write to a local database stored in your account
  • These local databases can be shared among users (for reading only) and can be accessed remotely
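The mapping effort described above can be sketched in a few lines. This is a toy illustration only: ToyStore is a hypothetical in-memory stand-in and is NOT the Access Layer API, and the "toy_output" structure and its field are invented for the example. It only shows the pattern of a GET, a call to an unchanged solver working on its native variables, and a PUT.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a database; NOT the IMAS Access Layer."""

    def __init__(self):
        self._structures = {}

    def put(self, name, structure):
        self._structures[name] = structure

    def get(self, name):
        return self._structures[name]


def physics_solver(currents):
    """Placeholder solver: works on native variables only, unaware of the Data Model."""
    return sum(currents)


def run_component(store):
    # GET: read the input structure and map it to the solver's native input.
    pf_active = store.get("pf_active")
    currents = [coil["current"]["data"][-1] for coil in pf_active["coil"]]

    total = physics_solver(currents)

    # PUT: map the native output back to a structure and write it.
    # "toy_output" and "total_current" are invented names for this sketch.
    store.put("toy_output", {"total_current": total})


store = ToyStore()
store.put("pf_active", {"coil": [
    {"current": {"data": [1.0, 2.0], "time": [0.0, 0.1]}},
    {"current": {"data": [5.0], "time": [0.0]}},
]})
run_component(store)
print(store.get("toy_output"))
```

The solver itself is untouched; only the thin wrapper around it knows about the structures being read and written.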

1.4. First use case: User or code accessing Data Base through Access Layer

1.5. Interface Data Structures (IDS) to couple codes

  • For Integrated Modelling, the Data Model also defines Interface Data Structures (IDS). These are structures within the Data Model that are used as standard interfaces between codes
  • Solves the N² problem (a large number of components from various ITER members is expected in IMAS)
  • The usage of the IDSs makes the coupling of codes straightforward if they are in the same programming language
  • The usage of the IDSs + AL allows coupling of codes even if they are not written in the same language
  • The usage of the IDSs does NOT constrain your choice of coupling method. Codes can be coupled as:
    • Subroutines within a main program
    • Executables within a script
    • Components within a workflow engine
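The scaling behind the N² problem can be illustrated with simple counting. The sketch below is arithmetic only, with hypothetical function names, and assumes that without a shared interface every ordered pair of codes needs its own converter, whereas with IDSs each code needs one mapping for reading and one for writing.

```python
def pairwise_converters(n_codes: int) -> int:
    """Converters needed when each code is adapted directly to every other code."""
    return n_codes * (n_codes - 1)


def shared_interface_mappings(n_codes: int) -> int:
    """Mappings needed when each code only reads and writes a shared interface."""
    return 2 * n_codes


for n in (3, 10, 50):
    print(n, pairwise_converters(n), shared_interface_mappings(n))
```

Under these assumptions the direct approach grows quadratically with the number of codes, while the shared-interface approach grows linearly.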

1.6. Second use case: codes coupled together directly (same language) or through AL (different languages)

1.7. Workflow Engine and Component Generator to facilitate the development of Integrated Modelling workflows

  • An Integrated Modelling simulation is described as a workflow with physics codes as components (modules)
  • The workflow engine allows users to 
    • Design the workflow
    • Choose its components and tune their code-specific parameters
    • Execute the workflow

  • The workflow engine will be used to help design sophisticated workflows (e.g. Plasma Reconstruction chain, fully modularized Transport Solver, …)
    • It is intuitive enough to allow “mere users” to develop their own workflows
    • It hides the complexity of code coupling, data transfer, remote job submission, …
    • It allows sharing codes and workflows
    • It allows coupling to the PCS Simulation Platform

  • Component generator: a user tool that turns an IDS-compliant physics code into a component of the workflow

1.8. Physics codes + Data Access wrapped into a workflow component

1.9. Workflow components coupled and executed within a workflow engine

1.10. The layered structure: from the physics solver to the launcher

  • The structure is layered so that functionalities are clearly separated
  • It is generic and independent of e.g. the launching script/workflow engine

  • The Physics solver part (dark green) is not changed and is not linked to the ITER Data Model; it may or may not use the ITER Data Model internally.
  • The architecture is identical to the case of a component called within a workflow engine
  • The Physics Subroutine can be directly reused to generate a workflow component – the IMAS Infrastructure provides a tool that generates the component (pink part) automatically

  • Exception: for codes handling massive amounts of data, Data Access is usually parallelised and must be done inside the physics_solver (no processor has enough memory to gather all data)

2. More details on the ITER Physics Data Model

2.1. Data Model: Interface Data Structure

  • The Data Model has a tree structure, for the sake of clarity
  • At the top level, a collection of modular structures representing
    • Abstract physical quantities (e.g. distribution functions)
    • Tokamak subsystems (e.g. PF systems)
  • These modular structures have the appropriate granularity for exchange in an IM workflow → they also represent standardised interfaces for communication between codes, named Interface Data Structures (IDSs)
    • Each has an “ids_properties” substructure (metadata + comments + timebase usage)
    • Each has a “code” substructure (traces the code-specific parameters of the code that generated this IDS)
    • Each has a generic timebase (“time”)

2.2. Data Model documentation

  • Dynamically generated

  • Open the documentation by typing: dd_doc

2.3. What is in the documentation?

  • List of all IDSs. For each of them, detailed documentation:
  • Full path name: name of all variables of the IDS, with their path in the structure. Replace “/” by the structure operator of the programming language, e.g. “%” in Fortran, “.” in C++, Matlab, Java, Python
  • Description
  • Definition
  • Units in []
  • In {}, whether it is STATIC (constant over a range of pulses, e.g. machine configuration), CONSTANT (constant over the pulse or the simulation), or DYNAMIC (time-dependent within the pulse or the simulation)
  • Data_Type: indicates whether it is a string, an integer or a real, and its dimension (0D, 1D, 2D, …)
  • Coordinates: for each dimension, the full path name to the related coordinate. If the dimension simply refers to a quantity not present in the Data Model, it is indicated as “1…N”
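The path-to-syntax rule above can be sketched as a small helper. The function and dictionary names are hypothetical and not part of any IMAS tool; the sketch only swaps the “/” separator and leaves index notation such as (i1) untouched, so language-specific indexing still has to be adapted by hand.

```python
# Structure operator per language, as listed in the documentation description.
STRUCTURE_OPERATOR = {
    "fortran": "%",
    "c++": ".",
    "matlab": ".",
    "java": ".",
    "python": ".",
}


def to_language_syntax(dd_path: str, language: str) -> str:
    """Replace the "/" of a documentation path by the language's structure operator."""
    operator = STRUCTURE_OPERATOR[language.lower()]
    return dd_path.replace("/", operator)


print(to_language_syntax("pf_active/coil(i1)/current/data(itime)", "Fortran"))
# prints: pf_active%coil(i1)%current%data(itime)
```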

2.4. Exercise: use the Data Model documentation

  • Go to the DM documentation and answer the following questions:
  • How many IDSs have been defined?
  • Where can I find the toroidal flux profile calculated by my equilibrium code?
  • What are its units?
  • Does it vary during the pulse?
  • How many dimensions does it have?
  • What are its axes?
  • Assuming I have retrieved a full equilibrium structure in my Fortran program, what syntax would I use for this variable?

2.5. Arrays of structure

  • Arrays of structures are used when the objects in a list have nodes of different sizes, in order to avoid creating large sparse arrays
  • Two kinds of arrays of structure are distinguished:
    • Case 1: The structure contains asynchronous nodes, e.g. PF coils may be acquired with different timebases. For example, pf_active/coil is a vector; in Fortran: pf_active%coil(i1). For each coil, the current is a “data+time” structure, i.e. each coil current has its own timebase:
      • pf_active%coil(i1)%current%data(itime)
      • pf_active%coil(i1)%current%time(itime)
    • These Case 1 AoS are used essentially in IDSs representing tokamak subsystems
    • Case 2: The coordinate of the array of structure is a timebase. An index of the array of structure represents a time slice. As a consequence, the structure contains only dynamic and synchronous nodes, e.g. equilibrium/time_slice(itime). This time slice representation allows the size of the children to vary as a function of time (e.g. variable grid size).
    • These Case 2 AoS are used essentially in IDSs representing abstract physical quantities.
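The two kinds of arrays of structure can be pictured with plain Python dataclasses. This is a toy model under stated assumptions, not the real IDS definitions: the class names and the "grid" child node are hypothetical, and only the node names pf_active/coil/current/data/time and equilibrium/time_slice come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Signal:
    """A "data+time" structure: values plus the timebase they were sampled on."""
    data: List[float] = field(default_factory=list)
    time: List[float] = field(default_factory=list)


@dataclass
class Coil:
    """Case 1 element: each coil current carries its own timebase."""
    current: Signal = field(default_factory=Signal)


@dataclass
class TimeSlice:
    """Case 2 element: one time slice; child sizes may differ between slices."""
    grid: List[float] = field(default_factory=list)  # hypothetical child node


# Case 1: two coils acquired on different timebases, no padding needed.
coils = [
    Coil(Signal(data=[1.0, 2.0, 3.0], time=[0.0, 0.1, 0.2])),
    Coil(Signal(data=[5.0, 6.0], time=[0.0, 0.5])),
]

# Case 2: the array index is the time index; the grid size varies with time.
time = [0.0, 1.0]
time_slice = [
    TimeSlice(grid=[0.0, 0.5, 1.0]),
    TimeSlice(grid=[0.0, 0.25, 0.5, 0.75, 1.0]),
]

print([len(c.current.time) for c in coils])
print([len(s.grid) for s in time_slice])
```

In both cases a single rectangular array would need padding to the largest size, which is the sparse storage that arrays of structures avoid.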

2.6. An IDS can be used in two different ways: with a homogeneous timebase or not

  • The Data Model provides the flexibility for every node to have its own timebase. This is mandatory to represent experimental data as they were acquired.
    • pf_active%coil(i1)%current%data(itime)
    • pf_active%coil(i1)%current%time(itime)
  • However, a frequent use case is that a code provides its output IDS(s) on a single timebase
  • Therefore there is a simplifying option to use an IDS with a homogeneous timebase, i.e. one that applies to all dynamic nodes of the IDS. This timebase is located at the top level of every IDS
    • pf_active%time
  • The ids_properties/homogeneous_time flag tells whether the IDS has been written with a homogeneous (unique) timebase (1) or not (0). In the latter case, the coordinates documentation indicates where the timebase is located in the structure.
  • The code writing the IDS is responsible for setting this flag and filling the appropriate time coordinate(s)
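A reader of an IDS can use the flag to decide which timebase applies to a dynamic node. The sketch below models an IDS as nested dictionaries, which is an assumption made for illustration; the helper name timebase_for is hypothetical, and only the node names and the flag values 1 and 0 come from the text above.

```python
def timebase_for(ids, node):
    """Return the timebase that applies to a dynamic "data+time" node of a toy IDS."""
    if ids["ids_properties"]["homogeneous_time"] == 1:
        # Homogeneous case: the top-level timebase applies to all dynamic nodes.
        return ids["time"]
    # Otherwise the node carries its own timebase.
    return node["time"]


homogeneous = {
    "ids_properties": {"homogeneous_time": 1},
    "time": [0.0, 0.1, 0.2],
    "coil": [{"current": {"data": [1.0, 2.0, 3.0], "time": []}}],
}
heterogeneous = {
    "ids_properties": {"homogeneous_time": 0},
    "time": [],
    "coil": [{"current": {"data": [5.0, 6.0], "time": [0.0, 0.5]}}],
}

print(timebase_for(homogeneous, homogeneous["coil"][0]["current"]))
print(timebase_for(heterogeneous, heterogeneous["coil"][0]["current"]))
```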

2.7. Data Entries

  • A Data Entry is a collection of IDSs, potentially all of them
  • Multiple occurrences of a given IDS can co-exist, e.g. multiple equilibria calculated by different codes / assumptions
  • A Data Entry is defined by:
    • IMAS version
    • User name
    • Machine name
    • Pulse number
    • Run number
  • The recommended usage for a Simulation is as follows:
    • The simulation starts by reading data from an Input Data Entry (which can belong to another User)
    • During the simulation, intermediate results are stored in a temporary “work” Entry (another Run number)
    • During or at the end of the run, the results intended to be archived are written to an Output Data Entry (Run number)
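The five identifying fields and the recommended input / work / output pattern can be sketched as follows. The DataEntry class is a hypothetical illustration, not an Access Layer type, and every field value below (version string, user names, machine name, pulse and run numbers) is an invented placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataEntry:
    """The five fields that identify a Data Entry, as listed above."""
    imas_version: str
    user: str
    machine: str
    pulse: int
    run: int


# Input Data Entry, here owned by another user (placeholder values).
input_entry = DataEntry(imas_version="3", user="alice", machine="iter", pulse=1000, run=1)

# Temporary "work" Entry in my own account: same pulse, another Run number.
work_entry = replace(input_entry, user="bob", run=2)

# Output Data Entry holding the results to be archived: yet another Run number.
output_entry = replace(work_entry, run=3)

print(input_entry)
print(work_entry)
print(output_entry)
```

Keeping the three entries distinct means the input is never overwritten and the intermediate results can be discarded without touching the archived output.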
