1. Container image
- Images are binary files containing all the data and metadata required to start the container
- They can be built locally or downloaded from remote locations
- The most common standard is the Docker image format
1.1. Docker image format
- A Docker image is a `tar` archive with metadata and layers
- Each layer consists of its own metadata and another `tar` archive with the set of changes the layer introduces
- The first metadata file in the image is `manifest.json`:

```json
[
  {
    "Config": "f63181f19b2fe819156dcb068b3b5bc036820bec7014c5f77277cfa341d4cb5e.json",
    "RepoTags": ["ubuntu:latest"],
    "Layers": [
      "151ae8ef4f042fd5173fd2497f0a365b4413468163e7bd567146f29dcfea3517/layer.tar",
      "2872658e1abe34d0c7391abbc0848fdeddb456659e39511df0574fcfc8b7ad70/layer.tar",
      "2b83a9243dd8405d0811beeb14aeb797745b100e4538d056adb63fcc6b47c59f/layer.tar"
    ]
  }
]
```

It contains:

- `Config` -- path to the configuration file (architecture requirements, etc.)
- `RepoTags` -- the list of tags used
- `Layers` -- paths to the `tar` files containing layer data
1.2. OCI image format
- An alternative format was proposed by the OCI (Open Container Initiative)
- It is also a `tar` archive containing metadata and layers in the form of embedded `tar` archives
- The first metadata file is `index.json`:

```json
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:7ad481b55901a1b5472c0e1b3fbf0bf2867dc38feb6eb7a18cd310f00208e05c",
      "size": 658
    }
  ]
}
```

- The manifest contains paths to the configuration and layers:

```json
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:10bdc2317d43a5421151e135881e172002c7d61e934de7e1e79df560a151f112",
    "size": 2427
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f3f8f4bd7c131f4d967bc162207ab72c24f427915682f895eb4f793ad05d7e35",
      "size": 29989546
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:0188b501936213b7cd0b5333245960781a8b035249cfa427fe9a229fe557c624",
      "size": 924
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:db861e57845ea7ba52a2ac277abbdd8cd04bda5db69c49bf95be49d11e5a47e1",
      "size": 202
    }
  ]
}
```
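The OCI format is content-addressed: a blob's digest is the SHA-256 hash of its bytes, and in an OCI layout each blob is stored under `blobs/sha256/<digest>`, so every layer can be verified locally. A minimal sketch of this idea with a made-up blob (not a real image):

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/blobs/sha256"
# Pretend this file is a layer archive
printf 'pretend this is a layer tar\n' > "$tmpdir/layer"
# Its storage path is derived from the sha256 of its contents
digest=$(sha256sum "$tmpdir/layer" | cut -d' ' -f1)
mv "$tmpdir/layer" "$tmpdir/blobs/sha256/$digest"
# Verification: recompute the hash and compare it with the file name
test "$(sha256sum "$tmpdir/blobs/sha256/$digest" | cut -d' ' -f1)" = "$digest" && echo OK
```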
1.3. Local vs remote
- Both formats describe how the image is stored as a local file
- When transferring an image from a remote server, the client asks for the list of layers, checks its cache contents, and finally downloads only the missing layers
- This allows layers to be reused between images that depend on each other
- For example:
  - Let's say image `A` is based on `ubuntu`
  - Image `B` extends `A` by adding a Python executable on top
  - Image `C` also extends `A`, but it adds the Apache HTTP server instead
  - A first-time user of image `B` will download the layers from `ubuntu`, then from `A`, and finally from `B`
  - When they later want to use image `C`, only the layers with the Apache HTTP server will be downloaded, as all the previous ones are still in the cache
- Similarly, if both a `mybackend` and a `myfrontend` image are based on `ubuntu`, then the `ubuntu` layers are downloaded and cached only once
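The cache logic can be sketched as a toy shell function; this is a simulation of the idea, not the real Docker client:

```shell
CACHE=$(mktemp -d)

# Toy model of the client-side layer cache: "download" (here: a message and
# an empty marker file) only the layers not yet present in $CACHE.
pull_image() {
    for layer in "$@"; do
        if [ -e "$CACHE/$layer" ]; then
            echo "$layer: already exists"
        else
            echo "$layer: downloading"
            touch "$CACHE/$layer"
        fi
    done
}

pull_image ubuntu a b   # first use of image B: all three layers downloaded
pull_image ubuntu a c   # image C: only layer c is downloaded
```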
2. Building container images
- The most straightforward approach is by using the `Dockerfile` format
- A `Dockerfile` is a text file with commands describing the recipe to build the image
- The first command is `FROM <image>`, which instructs what to base the image on
- `RUN <command>` executes a command in the builder context
- `CMD <command>` configures the default command a container will run upon creation
- `EXPOSE <port>` adds metadata about a port a service inside the image will listen on (Important! The container creator decides which ports to publish and how; exposing a port in the `Dockerfile` serves as a form of documentation.)
- `ENV <key>=<value>` sets environment variables' values
- `COPY <src> <dest>` copies files from the host to the image
- `WORKDIR <dir>` sets the current working directory as seen inside the container
- `ARG <name>=<default>` configures a build-time argument that can be changed by the image builder
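As a quick illustration of `ARG`, `WORKDIR` and `COPY` working together, a hypothetical `Dockerfile` might look like this (all names here are made up for the example):

```dockerfile
FROM ubuntu
# Build-time argument; the builder can override it with:
#   docker build --build-arg APP_DIR=/srv/app .
ARG APP_DIR=/app
WORKDIR $APP_DIR
# index.html lands in $APP_DIR because of the WORKDIR above
COPY index.html .
CMD ["cat", "index.html"]
```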
2.1. Example
Contents of `index.html`:

```html
<h1>Hello World from Docker!</h1>
```

Contents of `Dockerfile`:

```dockerfile
FROM ubuntu
ENV DEBIAN_FRONTEND=noninteractive
COPY index.html /var/www/html/index.html
RUN apt-get update -y
RUN apt-get install -y apache2
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-DFOREGROUND"]
```
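To try the example, you would build and run it roughly as follows (this assumes a running Docker daemon; the `hello-apache` tag is an arbitrary name chosen here):

```shell
docker build -t hello-apache .
docker run -d -p 8080:80 hello-apache
curl http://localhost:8080
```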
3. Optimization
The layer system allows caching, but it also has consequences you need to be aware of:

- Scenario 1: Image `A` creates `/bigfile`; image `B`, extending `A`, deletes it. The deletion is merely a mask -- i.e. the user of `B` will not see `/bigfile`, but the file is still part of image `B` and still takes up a lot of space
- Scenario 2: One of the layers contains secrets (passwords, unencrypted private keys, etc.) and the following layers delete them. Even though the secrets are not directly visible, one can extract the `tar` archives of every layer and get access to them anyway
- In a `Dockerfile`, each command creates a separate layer, so you should usually do the following:

Combine subsequent `RUN` commands:

```diff
-RUN touch /test1
-RUN date > /test2
+RUN touch /test1 && date > /test2
```

In a combined `RUN`, make sure to delete all temporary files:

```diff
-RUN apt-get update -y
-RUN apt-get install -y git
-RUN rm -rf /var/lib/apt/lists/*
+RUN apt-get update -y \
+ && apt-get install -y git \
+ && rm -rf /var/lib/apt/lists/*
```

Where possible, avoid creating intermediate files at all:

```diff
-RUN curl URL > archive.tar
-RUN tar xf archive.tar
-RUN rm archive.tar
+RUN curl URL | tar x
```
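The last rewrite avoids the intermediate archive entirely. The same pipe pattern can be tried outside Docker; a small sketch with local directories standing in for the remote URL:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo hello > "$src/file.txt"
# Stream the archive straight into extraction: no archive file is ever
# written to disk, so inside a Dockerfile no archive ends up in any layer.
tar -C "$src" -c . | tar -C "$dst" -x
cat "$dst/file.txt"   # prints "hello"
```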
The building procedure also makes use of caching.

Scenario 1:
- Layer 1: Install Apache HTTP server
- Layer 2: Copy `index.html`

Scenario 2:
- Layer 1: Copy `index.html`
- Layer 2: Install Apache HTTP server

Both scenarios create equivalent images, but if you change `index.html`, then in Scenario 2 both layers will be rebuilt, while in Scenario 1 only the second one will be. A good practice is to order the layers according to their probability of change: the more likely a layer is to change, the later it should come.
3.1. Example after optimization
```dockerfile
FROM ubuntu
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -y \
 && apt-get install -y \
    apache2 \
 && rm -rf /var/lib/apt/lists/*
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-DFOREGROUND"]
COPY index.html /var/www/html/index.html
```
4. Multi-stage Dockerfiles
- Sometimes, to build an image, you first need to generate resources, e.g. compile and build a JAR file from a Java project
- In such cases, the source code and the set of tools required to process it are usually not needed in the final image
- To optimize the image, you could try performing the following in a single `RUN` command:
  - Transfer the source code
  - Install build-time dependencies (compilers, etc.)
  - Build everything
  - Transfer the generated resources to their final destination
  - Remove all intermediate files
- This is a lot to do in a single command, so it is prone to errors and hard to debug
- To overcome this, you can use multi-stage building, which is like building multiple images simultaneously while freely transferring files between them
- Each stage starts with its own `FROM <image> AS <stage>` command
- You can copy files between images created in separate stages using the `COPY --from=<stage>` syntax
4.1. Example
Contents of `hello.go`:

```go
package main

import "fmt"

func main() {
    fmt.Println("hello world")
}
```

Contents of `Dockerfile`:

```dockerfile
FROM golang AS builder
COPY hello.go hello.go
RUN go build hello.go

FROM ubuntu
COPY --from=builder /go/hello /usr/bin/hello
CMD /usr/bin/hello
```
5. BuildKit
- Starting with version 18.09, Docker ships with two build engines: the legacy one (used by default) and BuildKit
- To use the new engine, you have to set the environment variable `DOCKER_BUILDKIT=1`

The new engine has the following advantages:

- Independent stages' steps are executed in parallel
- You can pass private SSH keys to the build process and be sure they do not end up in any layer or metadata:

  ```
  RUN --mount=type=ssh <command>
  ```

  ```shell
  docker build --ssh default .
  ```

- Similarly, you can pass arbitrary secrets to the build process:

  ```
  RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
  ```

  ```shell
  docker build --secret id=mysecret,src=file.txt .
  ```
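A minimal sketch of a `Dockerfile` using the secret mount (the `# syntax` line enables the BuildKit frontend features; the id and file names are made up):

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu
# The secret is available at /run/secrets/mysecret only while this RUN
# command executes; it is stored in no layer and no image metadata.
RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
```

It would be built with `docker build --secret id=mysecret,src=file.txt .`.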
And the following disadvantage:
- It is harder to debug problems
6. Debugging (legacy engine)
- Many things might go wrong during Docker image preparation
- When building with the legacy engine, each command in Dockerfile creates a layer, which gets stored under unique id
- If something goes wrong, you can instantiate an interactive container from the last layer that was built successfully and look for clues
6.1. Example
Contents of `Dockerfile`:

```dockerfile
FROM alpine
RUN date > /tmp/build-date.txt
RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
```

Results of `docker build .`:

```
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM alpine
 ---> 14119a10abf4
Step 2/3 : RUN date > /tmp/build-date.txt
 ---> Running in 3fe480f490d7
Removing intermediate container 3fe480f490d7
 ---> 7eadcff6ea01
Step 3/3 : RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
 ---> Running in 7cba04ecd3e3
cat: can't open '/tmp/build-dat.txt': No such file or directory
The command '/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt' returned a non-zero code: 1
```
- The result of the `FROM` command is stored as `14119a10abf4`
- The result of the first `RUN` command is stored as `7eadcff6ea01`
- The second `RUN` fails, so you debug it by starting an interactive container (the `-it` switch):

```shell
docker run -it 7eadcff6ea01
```

- Now you can try to execute the command that failed:

```shell
cat /tmp/build-dat.txt
```

- Then figure out why it failed, what could have caused it, etc.
7. Debugging (BuildKit)
- With BuildKit layers are no longer stored after each command in the Dockerfile
- To improve performance, BuildKit only stores the image when a stage is finished
- To debug problems, you have to abuse this rule and introduce artificial stage beginnings and ends
7.1. Example
For the same `Dockerfile` as before, BuildKit produces the following output (with the `plain` TTY output style, configured by running `docker build --progress plain .`):

```
#2 [internal] load .dockerignore
#2 sha256:28b059ecac284a33ba98daa285c6a068d86485b54afc2e67f18e2bd1640d871a
#2 transferring context: 2B done
#2 DONE 0.1s
#1 [internal] load build definition from Dockerfile
#1 sha256:cbd3d6400308afcf33c0910b894d2e44156fc4127a0db290d19df5a4e8eae37e
#1 transferring dockerfile: 37B done
#1 DONE 0.3s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 sha256:d4fb25f5b5c00defc20ce26f2efc4e288de8834ed5aa59dff877b495ba88fda6
#3 DONE 0.0s
#4 [1/3] FROM docker.io/library/alpine
#4 sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7
#4 CACHED
#5 [2/3] RUN date > /tmp/build-date.txt
#5 sha256:684d70446f71e64256b21f59555e6fedc1eac55780675519af54f9e174fd16e1
#5 DONE 1.2s
#6 [3/3] RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
#6 sha256:bde8a93c3b05727094dd3d24e010c506f451a33718cc07f32f4e6b1ccab0b645
#6 0.925 cat: can't open '/tmp/build-dat.txt': No such file or directory
#6 ERROR: executor failed running [/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt]: exit code: 1
------
 > [3/3] RUN cat /tmp/build-dat.txt > /tmp/final-date.txt:
------
executor failed running [/bin/sh -c cat /tmp/build-dat.txt > /tmp/final-date.txt]: exit code: 1
```
- Although you can notice lines starting with `sha256:...`, they do not correspond to intermediate layer ids
- To be able to debug the problem as previously, you have to alter the `Dockerfile` by (1) naming the debugged stage if not yet done and (2) adding a `FROM scratch` line just before the command you need to debug:

```diff
-FROM alpine
+FROM alpine AS debug
 RUN date > /tmp/build-date.txt
+FROM scratch
 RUN cat /tmp/build-dat.txt > /tmp/final-date.txt
```

Now you can tell BuildKit to only build the `debug` target:

```shell
docker build --progress plain --target debug .
```

```
#1 [internal] load build definition from Dockerfile
#1 sha256:ffe58018ac4c453ce043471e51216f7528c1fe315a9a83fa1ad276df0ac9f8a6
#1 transferring dockerfile: 157B done
#1 DONE 0.1s
#2 [internal] load .dockerignore
#2 sha256:dda1a34cfb4f2eb8169d58953a937062de5c70a0ae78ca49118e49ea8279a7b7
#2 transferring context: 2B done
#2 DONE 0.2s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 sha256:d4fb25f5b5c00defc20ce26f2efc4e288de8834ed5aa59dff877b495ba88fda6
#3 DONE 0.0s
#4 [debug 1/2] FROM docker.io/library/alpine
#4 sha256:665ba8b2cdc0cb0200e2a42a6b3c0f8f684089f4cd1b81494fbb9805879120f7
#4 DONE 0.0s
#5 [debug 2/2] RUN date > /tmp/build-date.txt
#5 sha256:684d70446f71e64256b21f59555e6fedc1eac55780675519af54f9e174fd16e1
#5 CACHED
#6 exporting to image
#6 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#6 exporting layers
#6 exporting layers 0.4s done
#6 writing image sha256:0846d70d4f562fa95835da5893cb53beb7f623cf228f57460bd298a0f87da680 0.0s done
#6 DONE 0.5s
```

Finally, you can start an interactive container by taking the hash of the written image:

```shell
docker run -it 0846d70d4f562fa95835da5893cb53beb7f623cf228f57460bd298a0f87da680
```
8. IMAS Docker
8.1. General remarks
- The IMAS Docker build project is available at git.iter.org as `IMEX/imas-container`
- It contains `ansible-container/`, `buildah/` and `docker/` subdirectories; only the last one is actively developed
- The `Dockerfile` requires BuildKit, as it uses `--mount=type=ssh`
- This also means that if you want to build IMAS Docker, you need to configure an SSH agent and add to it the private key that gives access to git.iter.org projects
8.2. build.sh
This Bash script accepts the following command line options:

```
-f          disable cache (build everything from scratch)
-u          build with UDA
-c CPUs     number of CPUs [default=$(nproc)]
-t target   build only one target
```

There are three auxiliary Bash functions defined:

- `latest_git_tag url blacklisted`. Returns the latest (in the sense of `sort --version-sort`) tag in the git repository given by `url`. If `blacklisted` is given, tags matching it are ignored. For example, the UDA repository has the tag `code_camp_cadarache`, which should be ignored.
- `latest_stable_git_tag url`. As above, but the tag has to contain the `stable` keyword (applicable to MDS+)
- `latest_released_git_tag url`. As above, but the tag has to contain the `rel` keyword (applicable to Ant)
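As an illustration, `latest_git_tag` could be implemented roughly as follows (a hypothetical sketch, not the actual code from `build.sh`):

```shell
# Hypothetical sketch: list the remote tags, drop the blacklisted ones,
# then pick the highest one according to version sort.
latest_git_tag() {
    url=$1
    blacklisted=$2
    git ls-remote --tags --refs "$url" \
        | sed 's|.*refs/tags/||' \
        | if [ -n "$blacklisted" ]; then grep -v "$blacklisted"; else cat; fi \
        | sort --version-sort \
        | tail -n 1
}
```

Note that plain alphabetical sorting would rank a tag such as `code_camp_cadarache` above any numeric version, which is why the blacklist parameter exists.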
Next, in the `build.sh` script you can pin any of these tags to a specific value:

```shell
tag_al=4.8.7  # FIXME: 4.8.7 is used by ETS
tag_ant=1.10.6  # FIXME: later versions of ant fail to build...
tag_blitz=
tag_cmake=v3.20.0
tag_dd=3.31.0  # FIXME: 3.31.0 is used by ETS
tag_fc2k=
tag_hdf5=hdf5-1_12_0
tag_installer=
tag_kca=
tag_kepler_installer=
tag_kp=
tag_lapack=
tag_mdsplus=stable_release-7-96-15  # FIXME: later versions of mdsplus fail to build...
tag_tigervnc=
tag_uda=2.3.1  # FIXME: uda/2.3.1 is known to work well with uda-plugins/1.2.0
tag_uda_plugins=1.2.0  # FIXME: uda/2.3.1 is known to work well with uda-plugins/1.2.0
ver_kb=
```

All tags left unset are resolved using the functions defined previously.
8.3. Dockerfile
There are 14 stages in the `Dockerfile`, all based on Ubuntu 18.04:

- `common-builder` has compilers and libraries for building Blitz++, HDF5 and MDS+. It installs CMake from GitHub, because the version in the Ubuntu 18.04 repo is too old
- `blitz-builder` builds Blitz++ in `/opt/blitz`
- `hdf5-builder` builds HDF5 in `/opt/hdf5`
- `mdsplus-builder` builds MDS+ in `/opt/mdsplus`
- `imas-git-puller` pulls all repositories from git and is the only stage that accesses SSH keys
- `base` contains a long list of applications, libraries and environment variables that will be used by every other `imas/*` image
- `base-devel` adds the Intel compilers on top of that and uses them to compile BLAS and LAPACK
- `ual-devel` does the following: (1) compiles UDA and the UDA Plugins, (2) compiles IMAS without the Fortran interface, (3) compiles the Fortran interface, (4) installs IMAS, (5) builds the UDA Plugins again
  - UDA has a cyclic dependency: IMAS requires UDA, the UDA Plugins require IMAS
  - The Fortran interface takes the longest to compile and requires the most RAM. If you have trouble building the image, set the CPU count for parallel building to a smaller value (see `-c CPUs` in the `build.sh` description)
- `kepler-devel` adds Ant, JAXFront and Kepler
  - JAXFront is licensed software and IMAS Docker uses the free edition
- `fc2k-devel` adds FC2K
- `ual` starts from the `base` image and copies from `ual-devel` everything built previously (BLAS, LAPACK, UDA, IMAS)
  - This way the `ual` image is free of the `*-devel` software (i.e. the Intel compilers) and of the IMAS source code
- `kepler` extends it and copies from `kepler-devel`
- `fc2k` extends it and copies from `fc2k-devel`
- `gui` extends it and adds XFCE4 and TigerVNC

There are 7 images produced:

- `imas/ual-devel`
- `imas/kepler-devel`
- `imas/fc2k-devel`
- `imas/ual`
- `imas/kepler`
- `imas/fc2k`
- `imas/gui`
8.4. files/{base,ual,kepler,fc2k,gui}
There are several files used in the `Dockerfile` available in the `files/*` directories:

- In `files/base` you can find the `blas-ifort.pc` and `lapack-ifort.pc` files preconfigured for BLAS and LAPACK (they do not come with the upstream packages, so they were crafted manually)
- In `files/kepler` you can find a modulefile for JAXFront (it also had to be crafted manually)
- In `files/{ual,kepler}` you can find `Makefile.Docker.Ubuntu`, which is a configuration file used by the IMAS Installer or the Kepler Installer
  - In these files you select what to build, e.g. you can switch off `gfortran` compilation of the Fortran interface
- In `files/fc2k` you can find `install_Docker.Ubuntu.xml` and `settings_Docker.Ubuntu.xml`, which control the installation and usage of FC2K
- In `files/{ual,kepler,fc2k}` you can find `docker-entrypoint.sh`, which is set as the entrypoint of the corresponding images. The role of these scripts is to load the necessary modules (e.g. IMAS or UDA) and execute `imasdb test`
8.5. push.sh and save.sh
- The Bash script `push.sh` tags all images with the registry prefix and pushes them there
- The Bash script `save.sh` saves the non-devel images as `.tar.zst` files in the `/tmp` directory