...

  • The Data Model has a tree structure, for the sake of clarity
  • At the top level, a collection of modular structures representing
    • Abstract physical quantities (e.g. distribution functions)
    • Tokamak subsystems (e.g. PF systems)
  • These modular structures have the appropriate granularity for exchange in an IM workflow → they also serve as standardised interfaces for communication between codes, and are called Interface Data Structures (IDSs)
    • Each has an “ids_properties” substructure (metadata + comments + timebase usage)
    • Each has a “code” substructure (traces the code-specific parameters of the code that generated this IDS)
    • Each has a generic timebase (“time”)

 

Data Model: Occurrences

There can be multiple instances, or “occurrences”, of a given IDS in a Database Entry (see 5.2) or in an IMAS workflow. These occurrences can correspond to different methods for computing the physical quantities of the IDS, or to different functionalities in a workflow (e.g. store initial values, prescribed values, values at next time step, …).

By default, the IDS name without specification of the occurrence number (e.g. “equilibrium”) corresponds to occurrence “0”. IDS occurrences above the default value (occurrence “0”) are accessed by concatenating the name of the IDS with the occurrence number, with a “/” in between. For example “equilibrium/2” is the name of the occurrence number 2 of the equilibrium IDS. Note that “equilibrium/0” is not valid (temporary limitation).

In the present implementation, there is a pre-set maximum number of occurrences of a given IDS usable in a Database Entry or in a workflow. This number is given in the “Max. occurrence number” column of the list of IDSs table in the documentation. This limitation should be removed in the future.
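
The naming rule for occurrences can be sketched in a few lines of Python. This is only an illustration of the convention described above, not part of the Access Layer: the function names `occurrence_name` and `parse_occurrence_name` are hypothetical, and the check against the per-IDS maximum occurrence number is left out because that value comes from the documentation table.

```python
from typing import Tuple


def occurrence_name(ids_name: str, occurrence: int = 0) -> str:
    """Return the name used to address one occurrence of an IDS (hypothetical helper)."""
    if occurrence < 0:
        raise ValueError("occurrence number must be non-negative")
    # Occurrence 0 is addressed by the bare IDS name; "equilibrium/0" is not valid.
    if occurrence == 0:
        return ids_name
    return f"{ids_name}/{occurrence}"


def parse_occurrence_name(name: str) -> Tuple[str, int]:
    """Split an occurrence name back into (IDS name, occurrence number)."""
    ids_name, sep, suffix = name.partition("/")
    if not sep:
        # No "/" means the default occurrence.
        return ids_name, 0
    occurrence = int(suffix)
    if occurrence == 0:
        raise ValueError(f"{name!r} is not valid; use {ids_name!r} for occurrence 0")
    return ids_name, occurrence
```

For example, `occurrence_name("equilibrium", 2)` gives `"equilibrium/2"`, while `occurrence_name("equilibrium")` gives the bare name `"equilibrium"`.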

Data Model documentation

  • Dynamically generated

  • Open the documentation by typing: dd_doc

...

  • Go to the DM documentation and answer the following questions:
  • How many IDSs have been defined ?

    • Answer: 46

  • Where can I find the toroidal flux profile calculated by my equilibrium code ?

    • Answer: In the equilibrium IDS, search for “toroidal flux”, found at path time_slice(:)/profiles_1d/phi

  • What are its units ?
    • Answer: Wb
  • Does it vary during the pulse ?
    • Answer: Yes, it is dynamic
  • How many dimensions does it have ?
    • Answer: 1D (float)
  • What are its axes ?
    • Answer: time_slice(:)/profiles_1d/psi
  • Assume I have retrieved a full equilibrium structure in my Fortran program, what syntax would I use for this variable ?
    • Answer: equilibrium%time_slice(:)%profiles_1d%phi

Arrays of structure

  • Arrays of structures (AoS) are used when the objects in a list have nodes of different sizes, in order to avoid creating large sparse arrays
  • Two kinds of arrays of structure are distinguished:
    • Case 1: The structure contains asynchronous nodes, e.g. PF coils may be acquired with different timebases. For example, pf_active/coil is a vector, written in Fortran as pf_active%coil(i1). For each coil, the current is a “data+time” structure, i.e. each coil current has its own timebase:
      • pf_active%coil(i1)%current%data(itime)
      • pf_active%coil(i1)%current%time(itime)
    • These Case 1 AoS are used essentially in IDSs representing tokamak subsystems
    • Case 2: The coordinate of the array of structure is a timebase. An index of the array of structure represents a time slice. As a consequence, the structure contains only dynamic and synchronous nodes, e.g. equilibrium/time_slice(itime). This time slice representation allows the size of the children to vary as a function of time (e.g. variable grid size).
    • These Case 2 AoS are used essentially in IDSs representing abstract physical quantities.
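
The difference between the two kinds of arrays of structure can be illustrated with plain Python dataclasses. This is a minimal sketch, not the Access Layer data types: the class names are hypothetical, only the node names (coil, current, data, time, time_slice, profiles_1d, psi, phi) come from the text above, and the numbers are made up. Note that Python indices start at 0, whereas the Fortran examples start at 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Signal:
    """A "data+time" structure: values together with their own timebase."""
    data: List[float] = field(default_factory=list)
    time: List[float] = field(default_factory=list)


@dataclass
class Coil:
    """Case 1 element: one PF coil carrying its own asynchronous current signal."""
    current: Signal = field(default_factory=Signal)


@dataclass
class Profiles1D:
    psi: List[float] = field(default_factory=list)
    phi: List[float] = field(default_factory=list)


@dataclass
class TimeSlice:
    """Case 2 element: every node inside belongs to the same time."""
    profiles_1d: Profiles1D = field(default_factory=Profiles1D)


# Case 1: the array index runs over coils; each coil has a timebase of its own length.
coils = [
    Coil(Signal(data=[1.0, 1.5, 2.0], time=[0.0, 0.1, 0.2])),
    Coil(Signal(data=[3.0, 3.5], time=[0.0, 0.2])),
]

# Case 2: the array index runs over time; the grid size may change between slices.
time = [0.0, 0.1]
time_slice = [
    TimeSlice(Profiles1D(psi=[0.0, 0.5, 1.0], phi=[0.0, 0.2, 0.6])),
    TimeSlice(Profiles1D(psi=[0.0, 0.25, 0.5, 1.0], phi=[0.0, 0.1, 0.2, 0.6])),
]
```

In both cases a dense rectangular array would need padding (over coils in Case 1, over grid points in Case 2), which is what the array of structures avoids.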

An IDS can be used in two different ways: with a homogeneous timebase or not

  • The Data Model provides the flexibility that every node has its own timebase. This is mandatory to represent experimental data as it has been acquired.
    • pf_active%coil(i1)%current%data(itime)
    • pf_active%coil(i1)%current%time(itime)
  • However, a frequent use case is that a code will provide its output IDS(s) on a unique timebase
  • Therefore there is a simplifying option to use an IDS structure with a homogeneous timebase, i.e. that will apply to all dynamic nodes of the IDS. This timebase is located at the top level of every IDS
    • pf_active%time
  • The ids_properties/homogeneous_time flag tells whether the IDS has been written with a homogeneous (unique) timebase (1) or not (0). In the latter case, the coordinates documentation indicates where the timebase of each node is located in the structure.
  • The code writing the IDS is responsible for setting this flag and filling the appropriate time coordinate(s)
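
A reader of an IDS has to pick the right timebase depending on this flag. The sketch below shows that decision on a nested dictionary that mimics the pf_active layout. It is an assumption-laden illustration: the helper `timebase_for` is hypothetical, the dictionary is not the real Access Layer object, and in the non-homogeneous case it assumes the node is a “data+time” structure whose timebase is its own `time` child.

```python
def timebase_for(ids: dict, node: dict) -> list:
    """Return the timebase that applies to a dynamic "data+time" node (hypothetical helper)."""
    if ids["ids_properties"]["homogeneous_time"] == 1:
        # Homogeneous case: the single top-level timebase applies to every dynamic node.
        return ids["time"]
    # Non-homogeneous case: the node carries its own timebase.
    return node["time"]


# Made-up example: experimental data, each coil current on its own timebase.
pf_active_raw = {
    "ids_properties": {"homogeneous_time": 0},
    "time": [],
    "coil": [
        {"current": {"data": [1.0, 2.0, 3.0], "time": [0.0, 0.1, 0.2]}},
        {"current": {"data": [4.0, 5.0], "time": [0.0, 0.2]}},
    ],
}

# Made-up example: code output, every dynamic node on the top-level timebase.
pf_active_sim = {
    "ids_properties": {"homogeneous_time": 1},
    "time": [0.0, 0.5, 1.0],
    "coil": [{"current": {"data": [1.0, 1.0, 1.0], "time": []}}],
}
```

With these inputs, `timebase_for(pf_active_raw, pf_active_raw["coil"][1]["current"])` returns the coil's own timebase, while the same call on `pf_active_sim` returns the top-level `time`.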

IDS and time slices

  • An IDS potentially contains many time slices, possibly on different timebases
  • Because this is used frequently during workflows, time slicing operations are allowed by the Access Layer
    • GET_SLICE returns an IDS with all time dimensions of size 1 (thus representing a single "time slice"). Dynamic signals are interpolated (different options available)
    • PUT_SLICE appends the content of an IDS variable (with all time dimensions of size 1) to an IDS stored on disk. This allows accumulating time slices in an IDS progressively during a time loop
  • More options can be added in the future
  • In a Kepler workflow, only the reference of the IDS is circulating, so operations can be performed on this IDS either in SLICE mode or in FULL mode (applies to the full IDS with all time slices)
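
The semantics of the two slicing operations can be sketched on a single “data+time” signal. This is not the Access Layer API: the function names `get_slice_sketch` and `put_slice_sketch` are hypothetical, the signal is a plain dictionary, the “disk” is just an in-memory dictionary, and linear interpolation with clamping at the ends is only one assumed choice among the interpolation options mentioned above.

```python
from bisect import bisect_right


def interpolate_linear(time, data, t):
    """Linearly interpolate data(time) at t, clamping outside the timebase."""
    if t <= time[0]:
        return data[0]
    if t >= time[-1]:
        return data[-1]
    i = bisect_right(time, t)  # now time[i - 1] <= t < time[i]
    w = (t - time[i - 1]) / (time[i] - time[i - 1])
    return data[i - 1] + w * (data[i] - data[i - 1])


def get_slice_sketch(signal, t):
    """Return a copy of the signal reduced to one time slice (time dimension of size 1)."""
    return {"time": [t], "data": [interpolate_linear(signal["time"], signal["data"], t)]}


def put_slice_sketch(stored, one_slice):
    """Append a size-1 slice to a stored signal, as done progressively inside a time loop."""
    if len(one_slice["time"]) != 1 or len(one_slice["data"]) != 1:
        raise ValueError("put_slice expects all time dimensions to have size 1")
    stored["time"].extend(one_slice["time"])
    stored["data"].extend(one_slice["data"])
```

A time loop would typically call `get_slice_sketch` on its inputs at the current time, compute, and then call `put_slice_sketch` once per step so that the stored signal grows by one slice each iteration.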

Data Entries

  • A Data Entry is a collection of potentially all IDSs
  • Multiple occurrences of a given IDS can co-exist, e.g. multiple equilibria calculated by different codes / assumptions
  • A Data Entry is defined by:
    • IMAS version
    • User name
    • Machine name
    • Pulse number
    • Run number
  • The recommended usage for a Simulation is:
    • The simulation starts by reading data from an Input Data Entry (which can be that of another User)
    • During the simulation, intermediate results are stored in a temporary “work” Entry (another Run number)
    • During or at the end of the run, the results intended to be archived are written to an Output Data Entry (Run number)
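
The five fields that define a Data Entry, and the recommended input / work / output pattern, can be sketched as follows. This is only an illustration: the class `DataEntryKey` is hypothetical (it is not an Access Layer type), and the version string, user names, machine name, pulse number and run numbers are all made-up values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataEntryKey:
    """The five fields that identify a Data Entry (hypothetical container)."""
    imas_version: str
    user: str
    machine: str
    pulse: int
    run: int


# Input Data Entry, here owned by another user (all values are made up).
input_entry = DataEntryKey(imas_version="3", user="public", machine="test_machine",
                           pulse=1000, run=1)

# Temporary "work" Entry: same pulse, current user, another Run number.
work_entry = replace(input_entry, user="alice", run=100)

# Output Data Entry for the results to be archived: yet another Run number.
output_entry = replace(work_entry, run=2)
```

Keeping the three entries as distinct keys makes it explicit that the simulation never overwrites its input and that intermediate results stay separate from the archived output.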

...