Avoiding The Needless Multiplication Of Forms

Automated Testing In Docker

Dr Andrew Moss


So what is Docker?
Have you heard of Docker? You probably have—everybody’s talking about it. It’s the new hotness. Even my dad’s like, “what’s Docker? I saw someone twitter about it on the Facebook. You should call your mom."
--- Chris Jones
Hmmm. Still not entirely clear? Imagine that you want to completely isolate a process from the rest of your machine. It should not be able to interact with any other process, or the file-system, or access the network. First, let's think about how we would do this. The most obvious approach is to build a virtual machine with a separate OS install within it. The VM forms a boundary around the process running inside and prevents it from accessing anything that we don't set up explicitly. Cool. Now let's think about how clunky that would be to use.
The virtualised machine has separate I/O - e.g. a window pretending to be a monitor and mouse movements mapping to a virtual device. If we want to interact with it through a terminal we need to host sshd inside the machine to let us in. This is quite a complex mess just to run a command in an isolated environment.
Docker takes a different approach - a container is not a fully virtualised machine. Rather than simulate a computer for an OS to run within (as VirtualBox does, for example) Docker relies on the kernel partitioning the machine into isolated pieces that cannot communicate with one another. Is this as secure as real virtualisation? That is largely an open question right now - there are known exploits against hypervisors, and it is probable that the namespace support in the kernel will get a lot of testing and patching. It is quite hard to quantify which approach is currently the more secure, and which has the potential to be the most secure. (Random aside: if you are a CivEng student in security looking for a project idea then this would be viable - get in touch if you are interested.)
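As a taste of the kernel machinery involved: every Linux process records its namespace memberships under /proc, so we can peek at the isolation primitives Docker builds on. This is plain Linux, nothing Docker-specific:

```shell
# Every process belongs to a set of namespaces (mnt, pid, net, ...).
# Two processes in the same namespace show the same inode number here;
# a containerised process will show different inodes from the host.
ls -l /proc/self/ns
```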
Docker uses a client-server architecture - a resident daemon handles these lightweight virtual machines that partition up resources. The file-system of the host machine is not partitioned, instead separate mount-points are used to access an entirely separate file-system in loop-back mode. The really neat part here is the use of a union file system. Standard chunks of the OS can be mounted read-only (and thus shared between different machines) with the writable parts of the current application sitting in overlaid layers.
So how does all of this help us in isolated testing and reproducibility?
The lowest layer in the file-system will be a standard disk image of a particular OS install. We will then turn this docker image into a docker container which can be specialised as a dev environment to build the units under test. This container will not be used directly - every test will clone this container to produce a one-off environment that runs the test, collects the results and then gets destroyed. This guarantees that state cannot propagate between tests; the actions in one test cannot pollute the behaviour of another.
Gosh this sounds complicated, thankfully we live in the future and the developers of Docker have made this all insanely easy:
$ docker run -it --name moon_base1 ubuntu /bin/bash
Docker makes really nice randomised names for containers automatically, but we will want to refer back to this one frequently as our starting container, so we name it explicitly. The name ubuntu is a tag in the Docker repository that links to ubuntu:latest. When I run it today Docker resolves this to 14.04.2, works out which parts of the union fs are missing and downloads them for me. The -i means send my terminal input into the container, and the -t means allocate a pseudo-terminal for output.
root@b8e173f8481f:/# uname -a
Linux b8e173f8481f 3.14-2-amd64 #1 SMP Debian 3.14.15-2 (2014-08-09) x86_64 x86_64 x86_64 GNU/Linux
Absolutely awesome: the process of building the machine is completely automated, and the /bin/bash part of the run command-line has launched a shell process within the container with its standard streams mapped back to the controlling terminal. Gandalf's beard! What happens next is the really interesting bit, though.
Passing an EOF (ctrl-d) to bash kills it, as it does in a normal terminal, and as that was the running (encapsulated) process Docker has killed the container. This is the standard use-case for Docker (application deployment) so handling one-shot containers is a joy. Even better, though, is the way in which Docker has killed the container: it is not deleted from the system until we tell it, but rather sits in a frozen state:
$ docker ps -a
CONTAINER ID    IMAGE     COMMAND        CREATED        STATUS                     PORTS    NAMES
b8e173f8481f    ubuntu    "/bin/bash"    2 hours ago    Exited (0) 2 hours ago              moon_base1
$ docker restart moon_base1
moon_base1
$ docker ps -a
CONTAINER ID    IMAGE     COMMAND        CREATED        STATUS          PORTS    NAMES
b8e173f8481f    ubuntu    "/bin/bash"    2 hours ago    Up 2 seconds             moon_base1
Docker remembers that when we specified the /bin/bash binary to hold in the container we told it to bind the standard streams. When it restarts the process the streams are bound once again. People who are used to working in a screen session because ssh tends to drop long-term connections will then become really quite moist with excitement:
$ docker attach moon_base1
root@b8e173f8481f:/#
So Docker gives us the tools that we need to quickly setup, tear-down and reuse containers that wrap up particular installations of an OS. Next we need to specialise them to the particular environment to run a single test.