Docker

What is it?

Virtual machines (VMs) provide a simple way to configure and obtain the same working environment on different computers. Running the same VM image on two computers lets users bring their own libraries and programs without installing them on each machine, but the price is high: a VM contains a whole operating system plus all the required software, so its image may occupy a lot of space. This becomes a problem every time you have to copy the VM to a new computer and run it there: a lot of resources are used.

A relatively new idea is to avoid shipping a whole operating system inside a VM and to use the kernel of the host as a starting point instead. What the user creates and uses is a "container" that stores only the software layers that must be present above the common kernel. The virtualization is provided above the Linux kernel and concerns only the additional layers. Each container runs independently, so different users can run their pre-configured environments on the same computer without any interference. The advantage is that a container is typically much lighter than a VM.

Docker is one of the various software packages that provide such container systems, and its popularity has increased a lot in the last few years. A Docker container packs an application with all its dependencies inside a lightweight sandbox that can be executed on any computer where the Docker engine is present.

Open an existing image

The Docker community nowadays provides a lot of publicly available images (the read-only templates from which containers are started), from the simplest ones ("hello-world", for example) to the most complex ones: you can search the Docker Hub to find a number of existing images that you can use. Examples are the "ubuntu" or "debian" images, which contain the basic software and configuration of the corresponding operating systems, or the "python", "r-base", "wordpress", "java", "php", "perl" images and many more.

The base command is docker, followed by a subcommand and its options; docker --help lists the available subcommands and options.

To use one of these images, start a container from it with docker run. If you want to open a bash shell inside an ubuntu container (latest version), you have to run the container interactively (this is not the only possibility: you can also simply run an application or a service inside the container):

docker run -it ubuntu:latest bash

Note that you will be inside the container as the root user.
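
The non-interactive possibility mentioned above can be sketched as follows. This is only an illustration: the commands executed inside the container (cat /etc/os-release and whoami) are arbitrary choices, and the guard on the docker command is there just to keep the sketch harmless on machines without Docker.

```shell
# Image name taken from the text above
image="ubuntu:latest"

# Run only where the Docker CLI is installed
if command -v docker >/dev/null 2>&1; then
    # --rm removes the container as soon as the command exits
    docker run --rm "$image" cat /etc/os-release
    # Print the user that runs commands inside the container
    docker run --rm "$image" whoami
fi
```

Without -it no shell is opened: the container runs the single command, prints its output and stops.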

Each time you start a container from ubuntu:latest you will find the same environment. Any change that you made in a previous session is not carried over, unless the container has been committed to a new image (see below).

Inside the container you have a bash shell in an independent environment that does not interfere with your OS. You can run all the commands that you want and they will be completely separated from the external environment. The only points of contact are the kernel and possibly the filesystem. If you need to access files or folders of your filesystem from inside the container, you have to mount a volume. This does not modify the existing image, but gives the new container access to the external filesystem:

docker run -it -v /local/dir:/virt/path ubuntu:latest bash
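
A minimal sketch of the volume mount, assuming a hypothetical host folder data in the current directory and a hypothetical mount point /data inside the container. The :ro suffix mounts the folder read-only, which is a cautious default when the container only needs to read the files.

```shell
# Hypothetical paths, chosen only for illustration
host_dir="$PWD/data"      # the host path given to -v must be absolute
mount_point="/data"

# Create a file on the host side
mkdir -p "$host_dir"
echo "hello from the host" > "$host_dir/note.txt"

# Run only where the Docker CLI is installed
if command -v docker >/dev/null 2>&1; then
    # Read the host file from inside the container
    docker run --rm -v "$host_dir:$mount_point:ro" ubuntu:latest cat "$mount_point/note.txt"
fi
```

Dropping :ro makes the mount writable, so files created under /data inside the container appear in the host folder.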

Each time you exit a container, it is stopped but kept on disk for any further operation, such as a commit or a restart (docker start -ai container-ID). The existing containers can be listed with docker ps -a and deleted with docker rm container-ID. In the same way, images are listed with docker images and deleted with docker rmi image-ID.

Repository

If the ubuntu:latest image is not available locally, it is retrieved from a registry. The default registry is the Docker Hub, which contains a huge number of official and non-official images. Any registered user can submit their own images and share them with the whole community.

It is also possible to create a local (private) registry and use it to share images that should not be made public. A typical reason is that images containing proprietary software cannot be published, but can still be shared inside a local network through a local registry.

In Torino we have a private registry hosted by to4pxl inside the INFN network. To use it, you should tag the image that you want to save locally and then push it to the local registry.

To do this, use to4pxl:5000/imagename as an additional tag. For example, to tag the ubuntu:latest image to be saved locally, use

docker tag ubuntu:latest to4pxl:5000/ubuntu:latest

You then have to push the image to the local registry to make it available for a later pull:

docker push to4pxl:5000/ubuntu:latest
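
The later pull, run from another machine inside the same network, might look like the sketch below. The registry address and the image name come from the text; the fallback message is only there because the registry cannot be reached from outside the INFN network.

```shell
registry="to4pxl:5000"          # registry host and port from the text
image="ubuntu:latest"
remote="$registry/$image"       # full name of the image in the local registry

# Run only where the Docker CLI is installed
if command -v docker >/dev/null 2>&1; then
    # Download the image from the local registry instead of the Docker Hub
    docker pull "$remote" || echo "registry not reachable from this machine"
    # List the local copies that belong to this repository
    docker images "$registry/ubuntu"
fi
```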

The interaction with the local registry (listing images or their properties, deleting objects and more) is done through HTTP requests, as explained in this page.
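
As a sketch of such HTTP requests, the following assumes that the registry implements the Docker Registry HTTP API V2 and that it answers over plain HTTP; both are assumptions about the to4pxl setup, not facts stated on this page (use https:// if the registry is served with TLS).

```shell
registry="to4pxl:5000"                      # registry host and port from the text
base_url="http://$registry/v2"              # assumption: plain HTTP, API version 2
catalog_url="$base_url/_catalog"            # lists the repositories in the registry
tags_url="$base_url/ubuntu/tags/list"       # lists the tags of the ubuntu repository

# Run only where curl is installed
if command -v curl >/dev/null 2>&1; then
    curl -s --max-time 5 "$catalog_url" || echo "registry not reachable from this machine"
    curl -s --max-time 5 "$tags_url" || echo "registry not reachable from this machine"
fi
```

Both endpoints return JSON documents.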

Create an image

There are two ways to create an image: opening an existing image, manually performing the necessary operations and committing the changes to a new image, or writing a Dockerfile and building the image automatically.

The first method is the simplest, but it does not allow the build to be automated. In some cases, this results in poorly optimized layers and in an increased image size. I will not cover these topics here; you can find some references at the end of the page.

Start by opening an interactive container based on an existing image. Perform all the operations needed to set up your environment, as you would do on a physical computer. At the end, exit the container. You can then commit the changes and save a new image with:

docker commit -m "commit message" -a "Author Name" container-name-or-ID username/image-name:latest

container-name-or-ID identifies the container that you want to convert into an image. The last part, username/image-name:latest, tells Docker that image-name, version latest, has been created by the user who owns the username account on Docker Hub. This is required to add the new image to your account using docker push, but if you don't need to push it to the Docker Hub, you can use whatever you want as username. If the version is omitted, latest is used.
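
The whole commit workflow can be sketched as below. To keep the example scriptable, the customization is done with a single non-interactive command instead of an interactive session; the container name my-env and the installed package (python3) are hypothetical choices, while username/image-name is the placeholder used above.

```shell
container="my-env"                      # hypothetical container name
image="username/image-name:latest"      # placeholder name from the text

# Run only where the Docker CLI is installed
if command -v docker >/dev/null 2>&1; then
    # Customize a container; --name gives it a fixed name for the commit
    docker run --name "$container" ubuntu:latest \
        bash -c "apt-get update && apt-get install -y python3"
    # Save the stopped container as a new image
    docker commit -m "add python3" -a "Author Name" "$container" "$image"
    # Check that the new image contains the change
    docker run --rm "$image" python3 --version
    # The stopped container is no longer needed
    docker rm "$container"
fi
```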


The second method does not require running an interactive container and customizing it manually. You can write a Dockerfile and use it to build a new image automatically.

The command to build an image named label and to set the tag needed to push it to our local registry is

docker build -f dockerfile -t label -t to4pxl:5000/label .

which must be run in the folder where the file named dockerfile is saved. The official reference on how to write a Dockerfile is here.
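
A minimal sketch of the Dockerfile method, written as a shell script that creates the file in a scratch folder and then runs the build command shown above. The content of the Dockerfile (an ubuntu base with python3 added) is a hypothetical example, not a recipe used on this page.

```shell
# Scratch folder that will hold the Dockerfile
workdir="$(mktemp -d)"

# Write a small Dockerfile; the quoted EOF keeps the text exactly as typed
cat > "$workdir/dockerfile" <<'EOF'
# Start from an existing image
FROM ubuntu:latest
# Install the dependencies in a single layer and clean the apt cache
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/*
# Command executed when a container is started without arguments
CMD ["bash"]
EOF

# Run only where the Docker CLI is installed
if command -v docker >/dev/null 2>&1; then
    # Build in the folder of the Dockerfile and set both tags at once
    (cd "$workdir" && docker build -f dockerfile -t label -t to4pxl:5000/label .) \
        || echo "build failed"
fi
```

Grouping the installation and the cleanup in one RUN instruction keeps them in a single layer, which is one way to limit the image size.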

Docker and HTCondor

Docker is particularly interesting because HTCondor, starting from version 8.3.6, provides a docker universe. It is basically the same as the vanilla universe, but the job is executed inside a Docker container. Again, the advantage is that your application will work on any computer without the need to install additional software, since the dependencies are included in the container.

See the official HTCondor manual on how to submit a docker universe job. There are a few differences with respect to the submission of a vanilla universe job.
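
As an illustration, a docker universe submit file could look like the sketch below, modelled on the kind of example given in the HTCondor manual. The image, the executable and the file names are illustrative assumptions; the script only writes the submit file and submits it where condor_submit is available.

```shell
# Scratch folder for the submit file
jobdir="$(mktemp -d)"

# Write the submit description; the quoted EOF keeps $(Process) untouched
cat > "$jobdir/docker_job.sub" <<'EOF'
# Run the job inside a Docker container
universe                = docker
# Image used to start the container
docker_image            = ubuntu:latest
# Program that already exists inside the image
executable              = /bin/cat
arguments               = /etc/os-release
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = out.$(Process)
error                   = err.$(Process)
log                     = log.$(Process)
request_memory          = 100M
queue 1
EOF

# Submit only where HTCondor is installed
if command -v condor_submit >/dev/null 2>&1; then
    (cd "$jobdir" && condor_submit docker_job.sub) || echo "submission failed"
fi
```

Compared with a vanilla universe job, the visible differences are the universe line and the docker_image command.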

References