
In this tutorial

  • what CPOs are
  • what the UAL is

 

 


1.1. CPO - Consistent Physical Object

A Consistent Physical Object (CPO) is a data structure that contains the relevant information on a physical entity, e.g. the plasma equilibrium, the core plasma profiles (densities, temperatures, etc.), distribution functions, etc.

What is important from the user's perspective is the way CPO data are stored. The documentation of each CPO structure can be accessed via the UAL documentation on the ITM Portal.

Accessing the documentation structure is fairly easy. Open the ITM Portal web page in a browser and navigate to the ISIP-related documentation.

 

ITM Portal page

 

Once the documentation page is open, browse to "Data structure" (located at the bottom of the picture).

 

ITM Portal - ISIP related section

 

You should see a web page similar to the one shown in the picture below.

UAL documentation

 

 

After choosing "Browse" you will see the structure of the CPOs.

CPOs within data structure

 

What you can see in the picture above is a collection of CPOs. All CPOs are bound to the top-level element. After you expand a particular CPO you can browse its details. In this case magdiag (magnetic diagnostics) was expanded.

 

Downloading structure

If you plan to browse the structure extensively, it is advisable to download it to your local system.

 

Exercise no. 1 - After this exercise you will:
  • know how to find CPO related documentation
  • know how to browse through the tree structure
  • know how to access CPO details
Exercise no. 1 (approx. 10 min)

In this exercise you will browse the documentation of the CPO elements.

1. Log in to the ITM Portal

2. Browse to Documentation -> ISIP -> Data structure

3. Choose "Browse" from "Data structure 4.10b (Browse) (Download)"

4. Check a few CPOs to get familiar with the documentation

1.1.1. Structure of the CPO

In this section we will take a closer look at the data. After copying the MDSplus database files into your account's public area you can dump the data using the cpodump script.

cpodump 13 3 equilibrium
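In this call the two numbers are presumably the shot and run identifiers of the database entry you have just copied (13 and 3 in this example), and equilibrium is the name of the CPO to dump; the script should then print the contents of the stored equilibrium CPO so that you can compare them with the tree you browsed in the previous section.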
 

1.1.1.1. Time-independent CPOs vs. time-dependent CPOs

In general a CPO can contain both time-dependent and time-independent information. Typically, information related to the tokamak hardware will not be time-dependent, while the values of plasma physical quantities will likely be time-dependent. Therefore the fields of a CPO can be either time-dependent or time-independent. The CPO itself is time-independent if it contains only time-independent fields, and time-dependent otherwise. Only a few CPOs in the ITM database are time-independent (e.g. topinfo), while the others describe physical phenomena which vary over time.
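A minimal Fortran 90 sketch of the difference is given below: a time-independent CPO such as topinfo is a single structure with no time axis, while a time-dependent CPO such as equilibrium is handled as an array of elementary structures, one element per stored time slice. The module and type names (euitm_schemas, type_topinfo, type_equilibrium) follow the usual UAL naming convention but are assumptions to be checked against the data-structure documentation of your release.

! Sketch: time-independent vs. time-dependent CPOs as seen from Fortran.
! Module and type names are assumptions following the usual UAL conventions.
program time_dependence_sketch
  use euitm_schemas
  implicit none

  ! Time-independent CPO: a single structure, no time axis.
  type(type_topinfo) :: topinfo

  ! Time-dependent CPO: an array of elementary structures, one element per
  ! stored time slice; each slice carries its own scalar time field.
  type(type_equilibrium), pointer :: equilibrium(:) => null()

  allocate(equilibrium(2))            ! e.g. two stored time slices
  equilibrium(1)%time = 0.0d0
  equilibrium(2)%time = 1.0d0
  write(*,*) 'slice times: ', equilibrium(:)%time
  deallocate(equilibrium)
end program time_dependence_sketch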

1.1.1.2. Time slices

 

Time is of course a key physical quantity in the description of a tokamak experiment. Present day integrated modelling codes all have their workflow based on time evolution. Time has however a hybrid status because of the large differences in time scales in the various physical processes. For instance, the plasma force balance equilibrium (which yields the surface magnetic topology) or Radio Frequency wave propagation are established on very fast time scales with respect to other processes such as macroscopic cross-field transport. Therefore such physical problems are often considered as “static” by other parts of the workflow, which need to know only their steady-state solution. In fact such a configuration appears quite often in the physics workflows: a physical object that varies with time during an experiment (e.g., the plasma equilibrium) can also be considered as a collection of static time slices, and the other modules in the workflow will need only a single static time slice of it to solve their own physical problem at a given time. Therefore the CPO organisation must be convenient for both uses.

From "A generic data structure for integrated modelling of tokamak physics and subsystems", F. Imbeaux at al.https://infoscience.epfl.ch/record/153304/files/1005201.pdf

 

Time-dependent CPOs are treated as arrays of the elementary CPO structure, i.e. in Fortran: equilibrium( : ) is a pointer and equilibrium( i ) is an equilibrium structure corresponding to time index i. Since many physics codes manipulate only one time slice at a time, special UAL functions exist to extract a single time slice of the CPO: these are the GET_SLICE and PUT_SLICE functions.

source: UAL User Guide

Each time slice is located at a given index. When you ask for a time slice at a given time, there are three possible ways of selecting the "correct" slice (a sketch of a single-slice read follows the list below):

  • 1 : CLOSEST_SAMPLE
    the returned CPO is the stored CPO whose time is closest to the passed time;
  • 2 : PREVIOUS_SAMPLE
    the returned CPO is the stored CPO whose time is just before the passed time;
  • 3 : LINEAR INTERPOLATION
    the values of the time-dependent fields of the returned CPO are computed by linear interpolation between the corresponding values of the CPOs stored just before and just after the passed time.
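As a concrete illustration, the sketch below asks the UAL for a single equilibrium slice at t = 1.0 s using option 3 (linear interpolation). The module and routine names (euitm_routines, euitm_open, euitm_get_slice, euitm_close) and their argument order reflect typical Fortran UAL usage but are assumptions; check the UAL User Guide for your release.

! Sketch: reading a single time slice of the equilibrium CPO.
! Routine names and signatures are assumptions; consult the UAL User Guide.
program get_slice_sketch
  use euitm_schemas
  use euitm_routines                               ! assumed module exposing the euitm_* calls
  implicit none

  type(type_equilibrium) :: eq_slice               ! one elementary, single-time CPO
  integer :: idx
  integer, parameter :: LINEAR_INTERPOLATION = 3   ! option 3 in the list above
  real(kind=8) :: t = 1.0d0                        ! requested time in seconds

  ! Open the database entry (shot 13, run 3, as in the cpodump example).
  call euitm_open('euitm', 13, 3, idx)

  ! Return the equilibrium at t, linearly interpolated between the stored
  ! slices just before and just after the requested time.
  call euitm_get_slice(idx, 'equilibrium', eq_slice, t, LINEAR_INTERPOLATION)
  write(*,*) 'returned slice time = ', eq_slice%time

  call euitm_close(idx)
end program get_slice_sketch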
The CPOs are the transferable units for exchanging physics data in a workflow. For instance, a module calculating a source term for a transport equation will exchange this information via a “coresource” CPO. For modelling tokamak experiments, multiple source terms are needed because many different heating/fuelling/current drive mechanisms are commonly used simultaneously. In present day integrated modelling codes, this is usually done by pre-defining in the data structure slots for each of the expected heating methods. Moreover, the way to combine these source terms is also usually pre-defined, though present codes have some possibility to tune their workflows with a set of flags – still within a pre-defined list of possible options. In view of providing a maximum flexibility of the workflow (which can also be something completely different from the usual “transport” workflows), we have to abandon this strategy and go for multiple CPO occurrences.

For each physical problem we have defined a CPO type that must be used to exchange information related to this problem. While the elementary CPO structure is designed for a single time slice, we have gathered all time slices referring to the same physical object in an array of CPO time slices. This array is the unit which is manipulated when editing workflows. We now introduce the possibility to have in a workflow multiple CPO occurrences, i.e. multiple occurrences of arrays of CPOs of the same type.

The multiple source terms example is a good one to illustrate this additional level of complexity. We may have in the workflow an arbitrary number of source modules, each of them producing a “coresource” output. These output CPOs must be initially clearly separated since they are produced by independent modules, therefore they are stored in multiple occurrences of the generic “coresource” CPO. The various source modules may be called at different times of the workflow; this is allowed since all occurrences are independent: they can have an arbitrary number of time slices and each has its own time base. In the end, the transport solver actor needs only a single “coresource” CPO as input. This one has to be another occurrence of “coresource”, created by combining the previous “coresource” CPOs directly produced by the physics modules. How this combination is done is fully flexible and left to the user’s choice, since this is a part of the workflow. Likely he will have to use a “combiner” actor that takes an arbitrary number of “coresource” CPO occurrences and merges them into a single “coresource” CPO occurrence. For the moment, the details of the merging procedure have to be written in one of the ITM-TF languages (Fortran 90/95, C++, Java, Python) and then wrapped in the same way as the physical modules to become the “combiner” actor. The ideal way of doing this would be to define for each CPO the meaning of simple standard operations, such as the “addition” of two CPOs of the same type, and be able to use this as an operator or a generic actor in the KEPLER workflow. This possibility will be investigated in the future.

Another example of use for multiple CPO occurrences is the need for using equilibria with different resolutions in the same workflows. Some modules need to use only a low resolution equilibrium, while others (such as MHD stability modules) need a much higher resolution. To produce this, multiple equilibrium solver actors must appear in the workflow and store their output in multiple occurrences of the “equilibrium” CPO.

During a workflow, multiple occurrences of CPOs of the same type can be used when there are multiple actors solving the same physical problem. In most cases, these occurrences can be used for exchanging intermediate physical data during the workflow, i.e. data that the user does not want to store in the simulation results (such as all internal time steps of a transport equation solver). Each link drawn between actors in the KEPLER workflow refers to a CPO type and to an occurrence number (and all its time slices), in order to define the CPO unambiguously. The user decides which CPOs (type + occurrence) are finally stored in the simulation output database entry while the others are discarded after the simulation. Fig. 4 shows a toy example of a complex workflow with branching data flows and use of multiple CPO occurrences. Without entering into the details of the ITM-TF database, we simply underline the fact that a database entry consists of multiple occurrences of potentially all types of CPOs (see Fig. 5). A database entry is meant to gather a group of collectively consistent CPOs (produced, e.g., by a complete integrated simulation, or an experimental dataset to be used for input to a simulation, etc.).

We see how all the principles, concepts and rules that we have built up starting from the elementary single time slice CPO definition allow the creation of modular and fully flexible workflows with strong guarantees on data consistency. The apparent complexity introduced by the modular structure of the data is eventually a key ingredient to achieve such a goal, which would likely not be reachable or strongly error-prone in a system with a flat unstructured data model.

source: "A generic data structure for integrated modelling of tokamak physics and subsystems", F. Imbeaux et al. (linked above)
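To make the occurrence mechanism more concrete, the sketch below reads two independent occurrences of the coresource CPO, as they would be produced by two independent source modules, and leaves room for combining them into a third occurrence for the transport solver. In the Fortran UAL an occurrence is usually selected by appending its number to the CPO name in the path (e.g. 'coresource/2'); this path convention, the routine names and the type_coresource type are assumptions to be verified against the UAL User Guide, and the actual merging is omitted because it is workflow-specific (this is precisely the role of the "combiner" actor described above).

! Sketch: working with multiple occurrences of the "coresource" CPO.
! The 'cponame/occurrence' path convention and the routine/type names
! are assumptions; check the UAL User Guide for your release.
program occurrence_sketch
  use euitm_schemas
  use euitm_routines
  implicit none

  type(type_coresource), pointer :: src_ec(:), src_nbi(:), src_total(:)
  integer :: idx

  call euitm_open('euitm', 13, 3, idx)

  ! Occurrence 1: e.g. the output of an ECRH source module.
  call euitm_get(idx, 'coresource/1', src_ec)
  ! Occurrence 2: e.g. the output of an NBI source module.
  call euitm_get(idx, 'coresource/2', src_nbi)

  ! ... combine src_ec and src_nbi into src_total here; how this is done
  ! is workflow-specific and is the task of a "combiner" actor ...

  ! Store the merged result in a third occurrence for the transport solver:
  ! call euitm_put(idx, 'coresource/3', src_total)

  call euitm_close(idx)
end program occurrence_sketch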

1.2. UAL - Universal Access Layer

In order to cope with multiple languages while at the same time maintaining a unique structure definition, the UAL architecture defines two layers. The top layer provides the external Application Programming Interface (API), and its code is automatically produced from the XML description of the ITM database structure. For each supported programming language, a high-level layer is generated in the target language. The following sections describe the language-specific APIs and provide all the information required by simulation program developers.

The lower layer is implemented in C and provides unstructured data access to the underlying database. It defines an API which is used by all the high-level layer implementations. Knowledge of this API (presented in a later section) is not necessary for end users; it is only required by the developers of new language-specific high-level implementations of the UAL and by the developers of support tools for ITM management.

source: UAL User Guide
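For end users only the generated high-level layer matters. Below is a minimal sketch of a typical session in Fortran 90, one of the ITM-TF languages: open an existing database entry, read all stored time slices of a CPO, inspect them and close the entry. The module and routine names (euitm_schemas, euitm_routines, euitm_open, euitm_get, euitm_close) and their argument order reflect common UAL usage but are assumptions that should be confirmed in the UAL User Guide for your release.

! Minimal sketch of an end-user session with the high-level Fortran UAL.
! Routine names and signatures are assumptions; consult the UAL User Guide.
program ual_session_sketch
  use euitm_schemas               ! generated CPO derived types (assumed name)
  use euitm_routines              ! generated access routines (assumed name)
  implicit none

  type(type_equilibrium), pointer :: equilibrium(:) => null()
  integer :: idx

  ! Open the existing database entry (shot 13, run 3, as used earlier).
  call euitm_open('euitm', 13, 3, idx)

  ! Read every stored time slice of the equilibrium CPO in one call.
  call euitm_get(idx, 'equilibrium', equilibrium)
  if (associated(equilibrium)) then
     write(*,*) 'number of time slices read: ', size(equilibrium)
  end if

  call euitm_close(idx)
end program ual_session_sketch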

 

 
