version 3.0.0

Please note that this documentation covers version 3.0.0. The new features introduced in version 3.2.0 are not yet covered in this document.

source

Note that this document is based on the TeX documentation prepared by Thomas. The TeX file can be found in the doc directory of the xmllib Git repository:

> git clone ssh://git@git.iter.org/lib/xmllib.git
> cd xmllib/doc

1. What is XMLLIB?

XMLLIB is a library for parsing xml-files in Fortran. This means that it provides subroutines for extracting values from an xml file.

XMLLIB provides three different interfaces for parsing data, here referred to as the Konz, Xpath and xml2eg interfaces. The Konz interface is sometimes also referred to as the Classic interface. The xml2eg interface is the most flexible of the three, the most robust and also the simplest to use. The Konz and Xpath interfaces are kept only for legacy usage. The details of the interfaces are described in section 3.

Note that XMLLIB cannot parse arbitrary xml files. The restrictions that each interface places on the xml are described in section 2.

The XMLLIB source code is stored in the ITER git repository. To clone this repository:

> git clone ssh://git@git.iter.org/lib/xmllib.git

In this repository there are examples for each of the XMLLIB interfaces.

.
|-- examples
|   |-- classic  - classic interface (legacy code)
|   |-- cpp      - C++ based sample
|   |-- xml2eg   - samples for codes based on xml2eg library
|   `-- xpath    - samples for codes based on XPath
|-- src
`-- tests

2. XML formats used in XMLLIB

There are a number of restrictions on the xml files that XMLLIB can parse. In particular, XMLLIB cannot parse:

Attributes of any kind. As an example, XMLLIB cannot parse <species mass="2" charge="1"/>. This information can instead be represented using the following XML structure:

<species>
  <mass>   2 </mass>
  <charge> 1 </charge>
</species>

Arrays of elements, i.e. several sibling elements with the same name. Example:

<family>
<person><name> Bob </name></person>
<person><name> Nick </name></person>
</family>

The structure above cannot be parsed. Instead, the two persons may be described as an array of names:

<family><names> Bob , Nick </names></family>
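
When a list is stored as one comma-separated value, as in the names element above, the calling code has to split it itself whenever the value is read as a single string. The following is a minimal, self-contained sketch of such a split in plain Fortran. It does not use XMLLIB; the program name split_names, the variable raw and the fixed length of 32 characters per name are assumptions made for illustration only.

```fortran
program split_names
  implicit none
  ! Hypothetical content of <names> Bob , Nick </names>, read as one string
  character(len=*), parameter :: raw = ' Bob , Nick '
  character(len=32), allocatable :: names(:)   ! assumed maximum name length
  integer :: n, i, start, pos

  ! The number of entries is one more than the number of commas
  n = 1
  do i = 1, len(raw)
    if (raw(i:i) == ',') n = n + 1
  end do
  allocate(names(n))

  ! Cut the string at each comma and shift leading blanks to the end
  start = 1
  do i = 1, n
    pos = index(raw(start:), ',')
    if (pos == 0) then
      names(i) = adjustl(raw(start:))              ! last entry
    else
      names(i) = adjustl(raw(start:start+pos-2))   ! text before the comma
      start = start + pos                          ! continue after the comma
    end if
  end do

  do i = 1, n
    print '(a)', trim(names(i))
  end do
end program split_names
```

The sketch keeps empty entries (for example from a trailing comma) as blank strings; whether that is acceptable depends on the application.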

2.1. Restrictions in the xml format of the xml2eg interface

When using the xml2eg interface, the root element has to be /parameters, i.e. only data under /parameters can be accessed.

2.2. Restrictions in the format of the xml-schema for Konz and Xpath interfaces

The Xpath and Konz interfaces have restrictions on how the xml-schema file should be written. The main restriction is that any child xml-element has to be specified using a reference, ref, to a different element on the root level.

Example: Below, the element parameters declares its child node inline, so it cannot be parsed with the Konz or Xpath interfaces.

<xs:element name="parameters">
  <xs:complexType>
    <xs:all>
      <xs:element name="node" type="xs:float"/>
    </xs:all>
  </xs:complexType>
</xs:element>

By moving the declaration of node outside the parameters element and referring to it with ref, the schema can be parsed.

<xs:element name="parameters">
  <xs:complexType>
    <xs:all>
      <xs:element ref="node" minOccurs="1"/>
    </xs:all>
  </xs:complexType>
</xs:element>

<xs:element name="node" type="xs:float"/>

One consequence of this limitation is that the same name cannot be used for two fields in different branches of the xml-tree unless both fields have an identical format. As an example, there is no way to parse anything similar to the xml-tree below without renaming one of the elements called node.

<xs:element name="parameters">
  <xs:complexType>
    <xs:all>
      <xs:element name="integer">
        <xs:complexType>
          <xs:all>
            <xs:element name="node" type="xs:integer"/>
          </xs:all>
        </xs:complexType>
      </xs:element>
      <xs:element name="float">
        <xs:complexType>
          <xs:all>
            <xs:element name="node" type="xs:float"/>
          </xs:all>
        </xs:complexType>
      </xs:element>
    </xs:all>
  </xs:complexType>
</xs:element>

3. Interfaces

The interfaces provided in XMLLIB include subroutines both to read input files into a buffer and to parse them, i.e. to extract values from the data.

3.1. Reading input files

XMLLIB was originally built for the EFDA-ITM (later EUROfusion/WPCD) data structures called CPOs and was later adapted for the ITER/IMAS data structures, called IDSs.
In both these data formats a string is represented as an array of 132-character lines, i.e. character(len=132), pointer, dimension(:). Consequently, XMLLIB represents the xml in the same way, as a CPO or IDS string.

There are two interfaces for reading xml input files. The first one reads a single file into a buffer:

use f90_file_reader, only: file2buffer
character(len=132), pointer :: buffer(:) => NULL()
integer :: io_unit = 1
call file2buffer('data.xml', io_unit, buffer)
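
To illustrate the buffer layout, the sketch below reads a text file line by line into an array of 132-character strings using only standard Fortran. It is not the implementation of file2buffer; the program name read_to_buffer, the two-pass approach and the file name data.xml are assumptions, and this simple version silently truncates lines longer than 132 characters.

```fortran
program read_to_buffer
  implicit none
  character(len=132), pointer :: buffer(:) => null()
  character(len=132) :: line
  integer :: io_unit, ios, n, i

  ! Assumption: a file called data.xml exists in the working directory
  open(newunit=io_unit, file='data.xml', status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open data.xml'

  ! First pass: count the lines so the buffer can be allocated once
  n = 0
  do
    read(io_unit, '(a)', iostat=ios) line
    if (ios /= 0) exit
    n = n + 1
  end do

  ! Second pass: store one line of the file per array element
  allocate(buffer(n))
  rewind(io_unit)
  do i = 1, n
    read(io_unit, '(a)') buffer(i)
  end do
  close(io_unit)

  print '(a,i0,a)', 'read ', n, ' lines'
  deallocate(buffer)
end program read_to_buffer
```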

The second interface reads three files, an xml file, a schema file and a default or reference xml file, into three buffers, param_xml, param_xsd and param_default.

use xml_file_reader, only: fill_param
character(132), pointer :: param_xml(:) => NULL()
character(132), pointer :: param_xsd(:) => NULL()
character(132), pointer :: param_default(:) => NULL()
call fill_param(  param_xml  ,  param_xsd  ,  param_default , &
                 'input.xml' , 'input.xsd' , 'input_default.xml' )

The design of the second interface is motivated by the CPO and IDS data structures, which include three strings corresponding to the three files described above. As an example, the IDS derived type ids_parameters_input includes the three strings parameters_value, schema and parameters_default. The three strings can be filled in one call:

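use xml_file_reader, only: fill_param
! codeparam is filled by the call below, so it is not declared intent(in)
type(ids_parameters_input) :: codeparam
call fill_param( codeparam%parameters_value, codeparam%schema, codeparam%parameters_default, &
                 'input.xml',                'input.xsd',      'input_default.xml' )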

3.2. The xml2eg interface

To parse data using the xml2eg interface, one first translates the xml data into an abstract document of type type_xml2eg_document using the subroutine xml2eg_parse_memory.

Once the data is read into the document, data fields can be accessed using the subroutine xml2eg_get. This subroutine is overloaded and can read strings, as well as scalars and arrays of integers, single and double precision floats, and booleans. To read a particular value from the xml-tree, use an xpath format (e.g. tree/branch/leaf). Note, however, that the xml2eg format assumes that the root level is called parameters and that this root level is implicit in the xpath. Thus, while the absolute xpath would be /parameters/tree/branch/leaf, the value of the leaf is accessed by requesting tree/branch/leaf.

The interface to xml2eg_get is:

subroutine xml2eg_get(xml2eg_document, path, out, errorflag)
type(type_xml2eg_document) :: xml2eg_document
character(len=*), intent(in) :: path
<OutType>, intent(out) :: out
logical, optional :: errorflag

Here <OutType> is one of: integer; integer, dimension(:); real(r4); real(r4), dimension(:); real(r8); real(r8), dimension(:); logical; logical, dimension(:); or character(:). The kinds r4 and r8 are defined in the module xmllib_types.

The simplest usage of xml2eg is then:

use xml2eg_mdl, only: xml2eg_parse_memory, xml2eg_get, &
                      type_xml2eg_document
type(type_xml2eg_document) :: doc
character(len=132), dimension(:), pointer :: parameters  ! xml buffer, filled as in section 3.1
integer :: val
call xml2eg_parse_memory(parameters, doc)      ! build the document from the buffer
call xml2eg_get(doc, 'tree/branch/leaf', val)  ! reads /parameters/tree/branch/leaf

The xml2eg interface also includes error handling. When calling xml2eg_get one may provide a fourth, optional, output argument that returns a logical error flag: true if the reading failed and false if it was successful.

logical :: execution_error
...
call xml2eg_get(doc , 'tree/branch/leaf' , val, execution_error)
if (execution_error) then
...
end if

Once the xml-document has been read, all allocated data has to be freed. This is achieved by calling xml2eg_free_doc.

call xml2eg_free_doc(doc)
