1. Container image
- Images are binary files containing all the data and metadata required to start the container
- They can be built locally or downloaded from remote locations
- The most common standard is the Docker image format
1.1. Docker image format
- A Docker image is a `tar` archive with metadata and layers
- Each layer consists of its own metadata and another `tar` archive with the set of changes the layer introduces
- The first metadata file in the image is `manifest.json`:

```json
[
  {
    "Config": "f63181f19b2fe819156dcb068b3b5bc036820bec7014c5f77277cfa341d4cb5e.json",
    "RepoTags": ["ubuntu:latest"],
    "Layers": [
      "151ae8ef4f042fd5173fd2497f0a365b4413468163e7bd567146f29dcfea3517/layer.tar",
      "2872658e1abe34d0c7391abbc0848fdeddb456659e39511df0574fcfc8b7ad70/layer.tar",
      "2b83a9243dd8405d0811beeb14aeb797745b100e4538d056adb63fcc6b47c59f/layer.tar"
    ]
  }
]
```

It contains:

- `Config` -- path to the configuration file (architecture requirements, etc.)
- `RepoTags` -- the list of tags used
- `Layers` -- paths to the `tar` files containing layer data
1.2. OCI image format
- An alternative format was proposed by the OCI (Open Container Initiative)
- It is also a `tar` archive containing metadata and layers in the form of embedded `tar` archives
- The first metadata file is `index.json`:

```json
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:7ad481b55901a1b5472c0e1b3fbf0bf2867dc38feb6eb7a18cd310f00208e05c",
      "size": 658
    }
  ]
}
```

- The manifest contains paths to the configuration and layers:

```json
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:10bdc2317d43a5421151e135881e172002c7d61e934de7e1e79df560a151f112",
    "size": 2427
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f3f8f4bd7c131f4d967bc162207ab72c24f427915682f895eb4f793ad05d7e35",
      "size": 29989546
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:0188b501936213b7cd0b5333245960781a8b035249cfa427fe9a229fe557c624",
      "size": 924
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:db861e57845ea7ba52a2ac277abbdd8cd04bda5db69c49bf95be49d11e5a47e1",
      "size": 202
    }
  ]
}
```
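The OCI format is content-addressed: a blob's digest is the SHA-256 hash of its bytes, and in an OCI layout each blob is stored under `blobs/sha256/<digest>`, so every layer can be verified locally. A minimal sketch of this idea with a made-up blob (not a real image):

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/blobs/sha256"
# Pretend this file is a layer archive
printf 'pretend this is a layer tar\n' > "$tmpdir/layer"
# Its storage path is derived from the sha256 of its contents
digest=$(sha256sum "$tmpdir/layer" | cut -d' ' -f1)
mv "$tmpdir/layer" "$tmpdir/blobs/sha256/$digest"
# Verification: recompute the hash and compare it with the file name
test "$(sha256sum "$tmpdir/blobs/sha256/$digest" | cut -d' ' -f1)" = "$digest" && echo OK
```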
1.3. Local vs remote
- Both formats describe how the image is stored as a local file
- When transferring an image from a remote server, the client asks for the list of layers, checks its cache contents, and finally downloads only the missing layers
- This allows layers to be reused between images that depend on each other
- For example:
  - Let's say image `A` is based on `ubuntu`
  - Image `B` extends `A` by adding a Python executable on top
  - Image `C` also extends `A`, but it adds the Apache HTTP server instead
  - A first-time user of image `B` will download the layers from `ubuntu`, then from `A`, and finally from `B`
  - When they later want to use image `C`, only the layers with the Apache HTTP server will be downloaded, as all the previous ones are still in the cache
- Similarly, if both a `mybackend` and a `myfrontend` image are based on `ubuntu`, then the `ubuntu` layers are downloaded and cached only once
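The cache logic can be sketched as a toy shell function; this is a simulation of the idea, not the real Docker client:

```shell
CACHE=$(mktemp -d)

# Toy model of the client-side layer cache: "download" (here: a message and
# an empty marker file) only the layers not yet present in $CACHE.
pull_image() {
    for layer in "$@"; do
        if [ -e "$CACHE/$layer" ]; then
            echo "$layer: already exists"
        else
            echo "$layer: downloading"
            touch "$CACHE/$layer"
        fi
    done
}

pull_image ubuntu a b   # first use of image B: all three layers downloaded
pull_image ubuntu a c   # image C: only layer c is downloaded
```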
2. Building container images
- The most straightforward approach is by using the `Dockerfile` format
- A `Dockerfile` is a text file with commands describing the recipe to build the image
- The first command is `FROM <image>`, which instructs what to base the image on
- `RUN <command>` executes a command in the builder context
- `CMD <command>` configures the default command a container will run upon creation
- `EXPOSE <port>` adds metadata about a port a service inside the image will listen on (Important! The container creator decides which ports to publish and how; exposing a port in the `Dockerfile` serves as a form of documentation.)
- `ENV <key>=<value>` sets environment variables' values
- `COPY <src> <dest>` copies files from the host to the image
- `WORKDIR <dir>` sets the current working directory as seen inside the container
- `ARG <name>=<default>` configures a build-time argument that can be changed by the image builder
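As a quick illustration of `ARG`, `WORKDIR` and `COPY` working together, a hypothetical `Dockerfile` might look like this (all names here are made up for the example):

```dockerfile
FROM ubuntu
# Build-time argument; the builder can override it with:
#   docker build --build-arg APP_DIR=/srv/app .
ARG APP_DIR=/app
WORKDIR $APP_DIR
# index.html lands in $APP_DIR because of the WORKDIR above
COPY index.html .
CMD ["cat", "index.html"]
```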
2.1. Example
Contents of `index.html`:

```html
<h1>Hello World from Docker!</h1>
```

Contents of `Dockerfile`:

```dockerfile
FROM ubuntu
ENV DEBIAN_FRONTEND=noninteractive
COPY index.html /var/www/html/index.html
RUN apt-get update -y
RUN apt-get install -y apache2
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-DFOREGROUND"]
```
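To try the example, you would build and run it roughly as follows (this assumes a running Docker daemon; the `hello-apache` tag is an arbitrary name chosen here):

```shell
docker build -t hello-apache .
docker run -d -p 8080:80 hello-apache
curl http://localhost:8080
```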
3. Optimization
The layer system allows caching, but it also has consequences you need to be aware of:

- Scenario 1: Image `A` creates `/bigfile`; image `B`, extending `A`, deletes it. The deletion is merely a mask -- i.e. the user of `B` will not see `/bigfile`, but the file is still part of image `B` and still takes up a lot of space
- Scenario 2: One of the layers contains secrets (passwords, unencrypted private keys, etc.) and the following layers delete them. Even though the secrets are not directly visible, one can extract the `tar` archives of every layer and get access to them anyway
- In a `Dockerfile`, each command creates a separate layer, so you should usually do the following:

Combine subsequent `RUN` commands:

```diff
-RUN touch /test1
-RUN date > /test2
+RUN touch /test1 && date > /test2
```

In a combined `RUN`, make sure to delete all temporary files:

```diff
-RUN apt-get update -y
-RUN apt-get install -y git
-RUN rm -rf /var/lib/apt/lists/*
+RUN apt-get update -y \
+ && apt-get install -y git \
+ && rm -rf /var/lib/apt/lists/*
```

Where possible, avoid creating intermediate files at all:

```diff
-RUN curl URL > archive.tar
-RUN tar xf archive.tar
-RUN rm archive.tar
+RUN curl URL | tar x
```
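The last rewrite avoids the intermediate archive entirely. The same pipe pattern can be tried outside Docker; a small sketch with local directories standing in for the remote URL:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo hello > "$src/file.txt"
# Stream the archive straight into extraction: no archive file is ever
# written to disk, so inside a Dockerfile no archive ends up in any layer.
tar -C "$src" -c . | tar -C "$dst" -x
cat "$dst/file.txt"   # prints "hello"
```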
The building procedure also makes use of caching.

Scenario 1:
- Layer 1: Install Apache HTTP server
- Layer 2: Copy `index.html`

Scenario 2:
- Layer 1: Copy `index.html`
- Layer 2: Install Apache HTTP server

Both scenarios create equivalent images, but if you change `index.html`, then in Scenario 2 both layers will be rebuilt, while in Scenario 1 only the second one will be. A good practice is to order the layers according to their probability of change: the more likely a layer is to change, the later it should come.
3.1. Example after optimization
```dockerfile
FROM ubuntu
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -y \
 && apt-get install -y \
    apache2 \
 && rm -rf /var/lib/apt/lists/*
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-DFOREGROUND"]
COPY index.html /var/www/html/index.html
```
4. Multi-stage Dockerfiles
- Sometimes, to build an image, you first need to generate resources, e.g. compile and build a JAR file from a Java project
- In such cases, the source code and the set of tools required to process it are usually not needed in the final image
- To optimize the image, you could try performing the following in a single `RUN` command:
  - Transfer the source code
  - Install build-time dependencies (compilers, etc.)
  - Build everything
  - Transfer the generated resources to their final destination
  - Remove all intermediate files
- This is a lot to do in a single command, so it is prone to errors and hard to debug
- To overcome this, you can use multi-stage building, which is like building multiple images simultaneously while freely transferring files between them
- Each stage starts with its own `FROM <image> AS <stage>` command
- You can copy files between images created in separate stages using the `COPY --from=<stage>` syntax
4.1. Example
Contents of `hello.go`:

```go
package main

import "fmt"

func main() {
    fmt.Println("hello world")
}
```

Contents of `Dockerfile`:

```dockerfile
FROM golang AS builder
COPY hello.go hello.go
RUN go build hello.go

FROM ubuntu
COPY --from=builder /go/hello /usr/bin/hello
CMD /usr/bin/hello
```
5. BuildKit
- Starting with version 18.09, Docker ships with two build engines: the legacy one (used by default) and BuildKit
- To use the new engine, you have to set the environment variable `DOCKER_BUILDKIT=1`

The new engine has the following advantages:

- Independent stages' steps are executed in parallel
- You can pass private SSH keys to the build process and be sure they do not end up in any layer or metadata:

  ```
  RUN --mount=type=ssh <command>
  ```

  ```shell
  docker build --ssh default .
  ```

- Similarly, you can pass arbitrary secrets to the build process:

  ```
  RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
  ```

  ```shell
  docker build --secret id=mysecret,src=file.txt .
  ```
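A minimal sketch of a `Dockerfile` using the secret mount (the `# syntax` line enables the BuildKit frontend features; the id and file names are made up):

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu
# The secret is available at /run/secrets/mysecret only while this RUN
# command executes; it is stored in no layer and no image metadata.
RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
```

It would be built with `docker build --secret id=mysecret,src=file.txt .`.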
And the following disadvantage:
- It is harder to debug problems
6. Debugging (legacy engine)
- Many things might go wrong during Docker image preparation
- When building with the legacy engine, each command in Dockerfile creates a layer, which gets stored under unique id
- If something goes wrong, you can instantiate an interactive container from the last layer that was built successfully and look for clues
6.1. Example
Contents of `Dockerfile`:

```dockerfile
FROM alpine
RUN date > /tmp/build-date.txt
RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
```

Results of `docker build .`:

```
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM alpine
 ---> 14119a10abf4
Step 2/3 : RUN date > /tmp/build-date.txt
 ---> Running in 3fe480f490d7
Removing intermediate container 3fe480f490d7
 ---> 7eadcff6ea01
Step 3/3 : RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
 ---> Running in 7cba04ecd3e3
cat: can't open '/tmp/build-dat.txt': No such file or directory
The command '/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt' returned a non-zero code: 1
```
- The result of the `FROM` command is stored as `14119a10abf4`
- The result of the first `RUN` command is stored as `7eadcff6ea01`
- The second `RUN` fails, so you debug it by starting an interactive container (the `-it` switch):

```shell
docker run -it 7eadcff6ea01
```

- Now you can try to execute the command that failed:

```shell
cat /tmp/build-dat.txt
```

- Then figure out why it failed, what could have caused it, etc.
7. Debugging (BuildKit)
- With BuildKit layers are no longer stored after each command in the Dockerfile
- To improve performance, BuildKit only stores the image when a stage is finished
- To debug problems, you have to abuse this rule and introduce artificial stage beginnings and ends
7.1. Example
For the same `Dockerfile` as before, BuildKit produces the following output (with the `plain` TTY output style, configured by running `docker build --progress plain .`):

```
#2 [internal] load .dockerignore
#2 sha256:28b059ecac284a33ba98daa285c6a068d86485b54afc2e67f18e2bd1640d871a
#2 transferring context: 2B done
#2 DONE 0.1s
#1 [internal] load build definition from Dockerfile
#1 sha256:cbd3d6400308afcf33c0910b894d2e44156fc4127a0db290d19df5a4e8eae37e
#1 transferring dockerfile: 37B done
#1 DONE 0.3s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 sha256:d4fb25f5b5c00defc20ce26f2efc4e288de8834ed5aa59dff877b495ba88fda6
#3 DONE 0.0s
#4 [1/3] FROM docker.io/library/alpine
#4 sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7
#4 CACHED
#5 [2/3] RUN date > /tmp/build-date.txt
#5 sha256:684d70446f71e64256b21f59555e6fedc1eac55780675519af54f9e174fd16e1
#5 DONE 1.2s
#6 [3/3] RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
#6 sha256:bde8a93c3b05727094dd3d24e010c506f451a33718cc07f32f4e6b1ccab0b645
#6 0.925 cat: can't open '/tmp/build-dat.txt': No such file or directory
#6 ERROR: executor failed running [/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt]: exit code: 1
------
 > [3/3] RUN cat /tmp/build-dat.txt > /tmp/final-date.txt:
------
executor failed running [/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt]: exit code: 1
```
- Although you can notice lines starting with `sha256:...`, they do not correspond to intermediate layer ids
- To be able to debug the problem as previously, you have to alter the `Dockerfile` by (1) naming the debugged stage if not yet done and (2) adding a `FROM scratch` line just before the command you need to debug:

```diff
-FROM alpine
+FROM alpine AS debug
 RUN date > /tmp/build-date.txt
+FROM scratch
 RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
```

Now you can tell BuildKit to only build the `debug` target:

```shell
docker build --progress plain --target debug .
```

```
#1 [internal] load build definition from Dockerfile
#1 sha256:ffe58018ac4c453ce043471e51216f7528c1fe315a9a83fa1ad276df0ac9f8a6
#1 transferring dockerfile: 157B done
#1 DONE 0.1s
#2 [internal] load .dockerignore
#2 sha256:dda1a34cfb4f2eb8169d58953a937062de5c70a0ae78ca49118e49ea8279a7b7
#2 transferring context: 2B done
#2 DONE 0.2s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 sha256:d4fb25f5b5c00defc20ce26f2efc4e288de8834ed5aa59dff877b495ba88fda6
#3 DONE 0.0s
#4 [debug 1/2] FROM docker.io/library/alpine
#4 sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7
#4 DONE 0.0s
#5 [debug 2/2] RUN date > /tmp/build-date.txt
#5 sha256:684d70446f71e64256b21f59555e6fedc1eac55780675519af54f9e174fd16e1
#5 CACHED
#6 exporting to image
#6 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#6 exporting layers
#6 exporting layers 0.4s done
#6 writing image sha256:0846d70d4f562fa95835da5893cb53beb7f623cf228f57460bd298a0f87da680 0.0s done
#6 DONE 0.5s
```

Finally, you can start an interactive container by taking the hash of the written image:

```shell
docker run -it 0846d70d4f562fa95835da5893cb53beb7f623cf228f57460bd298a0f87da680
```
8. IMAS Docker
8.1. General remarks
- The IMAS Docker build project is available at git.iter.org as `IMEX/imas-container`
- It contains `ansible-container/`, `buildah/` and `docker/` subdirectories; only the last one is actively developed
- The `Dockerfile` requires BuildKit, as it uses `--mount=type=ssh`
- This also means that if you want to build IMAS Docker, you need to configure an SSH agent and add to it the private key that gives access to git.iter.org projects
8.2. build.sh
This Bash script accepts the following command line options:

```
-f          disable cache (build everything from scratch)
-u          build with UDA
-c CPUs     number of CPUs [default=$(nproc)]
-t target   build only one target
```

There are three auxiliary Bash functions defined:

- `latest_git_tag url blacklisted`. Returns the latest (in the sense of `sort --version-sort`) tag in the git repository given by `url`. If `blacklisted` is given, tags matching it are ignored. For example, the UDA repository has the tag `code_camp_cadarache`, which should be ignored.
- `latest_stable_git_tag url`. As above, but the tag has to contain the `stable` keyword (applicable to MDS+)
- `latest_released_git_tag url`. As above, but the tag has to contain the `rel` keyword (applicable to Ant)
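As an illustration, `latest_git_tag` could be implemented roughly as follows (a hypothetical sketch, not the actual code from `build.sh`):

```shell
# Hypothetical sketch: list the remote tags, drop the blacklisted ones,
# then pick the highest one according to version sort.
latest_git_tag() {
    url=$1
    blacklisted=$2
    git ls-remote --tags --refs "$url" \
        | sed 's|.*refs/tags/||' \
        | if [ -n "$blacklisted" ]; then grep -v "$blacklisted"; else cat; fi \
        | sort --version-sort \
        | tail -n 1
}
```

Note that plain alphabetical sorting would rank a tag such as `code_camp_cadarache` above any numeric version, which is why the blacklist parameter exists.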
Next, in the `build.sh` script you can pin any of these tags to a specific value:

```shell
tag_al=4.8.7  # FIXME: 4.8.7 is used by ETS
tag_ant=1.10.6  # FIXME: later versions of ant fail to build...
tag_blitz=
tag_cmake=v3.20.0
tag_dd=3.31.0  # FIXME: 3.31.0 is used by ETS
tag_fc2k=
tag_hdf5=hdf5-1_12_0
tag_installer=
tag_kca=
tag_kepler_installer=
tag_kp=
tag_lapack=
tag_mdsplus=stable_release-7-96-15  # FIXME: later versions of mdsplus fail to build...
tag_tigervnc=
tag_uda=2.3.1  # FIXME: uda/2.3.1 is known to work well with uda-plugins/1.2.0
tag_uda_plugins=1.2.0  # FIXME: uda/2.3.1 is known to work well with uda-plugins/1.2.0
ver_kb=
```

All tags left unset are resolved using the functions defined previously.
8.3. Dockerfile
There are 14 stages in the `Dockerfile`, all based on Ubuntu 18.04:

- `common-builder` has compilers and libraries for building Blitz++, HDF5 and MDS+. It installs CMake from GitHub, because the version in the Ubuntu 18.04 repo is too old
- `blitz-builder` builds Blitz++ in `/opt/blitz`
- `hdf5-builder` builds HDF5 in `/opt/hdf5`
- `mdsplus-builder` builds MDS+ in `/opt/mdsplus`
- `imas-git-puller` pulls all repositories from git and is the only stage that accesses SSH keys
- `base` contains a long list of applications, libraries and environment variables that will be used by every other `imas/*` image
- `base-devel` adds the Intel compilers on top of that and uses them to compile BLAS and LAPACK
- `ual-devel` does the following: (1) compiles UDA and the UDA Plugins, (2) compiles IMAS without the Fortran interface, (3) compiles the Fortran interface, (4) installs IMAS, (5) builds the UDA Plugins again
  - UDA has a cyclic dependency: IMAS requires UDA, the UDA Plugins require IMAS
  - The Fortran interface takes the longest to compile and requires the most RAM. If you have trouble building the image, set the CPU count for parallel building to a smaller value (see `-c CPUs` in the `build.sh` description)
- `kepler-devel` adds Ant, JAXFront and Kepler
  - JAXFront is licensed software and IMAS Docker uses the free edition
- `fc2k-devel` adds FC2K
- `ual` starts from the `base` image and copies from `ual-devel` everything built previously (BLAS, LAPACK, UDA, IMAS)
  - This way the `ual` image is free of the `*-devel` software (i.e. the Intel compilers) and of the IMAS source code
- `kepler` extends it and copies from `kepler-devel`
- `fc2k` extends it and copies from `fc2k-devel`
- `gui` extends it and adds XFCE4 and TigerVNC

There are 7 images produced:

- `imas/ual-devel`
- `imas/kepler-devel`
- `imas/fc2k-devel`
- `imas/ual`
- `imas/kepler`
- `imas/fc2k`
- `imas/gui`
8.4. files/{base,ual,kepler,fc2k,gui}
There are several files used in the `Dockerfile` available in the `files/*` directories:

- In `files/base` you can find the `blas-ifort.pc` and `lapack-ifort.pc` files preconfigured for BLAS and LAPACK (they do not come with the upstream packages, so they were crafted manually)
- In `files/kepler` you can find a modulefile for JAXFront (it also had to be crafted manually)
- In `files/{ual,kepler}` you can find `Makefile.Docker.Ubuntu`, which is a configuration file used by the IMAS Installer or the Kepler Installer
  - In these files you select what to build, e.g. you can switch off `gfortran` compilation of the Fortran interface
- In `files/fc2k` you can find `install_Docker.Ubuntu.xml` and `settings_Docker.Ubuntu.xml`, which control the installation and usage of FC2K
- In `files/{ual,kepler,fc2k}` you can find `docker-entrypoint.sh`, which is set as the entrypoint of the corresponding images. The role of these scripts is to load the necessary modules (e.g. IMAS or UDA) and execute `imasdb test`
8.5. push.sh and save.sh
- The Bash script `push.sh` tags all images with the registry prefix and pushes them there
- The Bash script `save.sh` saves the non-devel images as `.tar.zst` files in the `/tmp` directory