Pluggable Tools with Docker Data Containers

There are some apps that have a simple installation process. When using them with other applications in Docker, they may be able to be installed in their own data volume container and used in a pluggable way.

The kind of apps I’m talking about are some Java apps (and in fact, Java itself) which follow this installation process:

  1. Install the contents of the app into a single directory
  2. Set an environmental variable to point to the installation directory, e.g. XXX_HOME
  3. Add the executables of the app to the PATH environmental

 

That’s it.

An example of an app installation that follows this pattern is Gradle:

  1. Uncompress the Gradle files from an archive to a directory.
  2. Set the enviromental variable GRADLE_HOME to point to the gradle installation directory
  3. Add GRADLE_HOME/bin to the PATH

 

Docker

Using Gradle as an example, here is a Dockerfile that installs it in a data volume container:

# Install Gradle as a data volume container. 
#
# The app container that uses this container will need to set the Gradle environmental variables.
# - set GRADLE_HOME to the gradle installation directory
# - add the /bin directory under the gradle directory to the PATH

FROM mini/base

MAINTAINER David Wong

# setup location for installation
ENV INSTALL_LOCATION /opt

# install Gradle version required
ENV GRADLE_VERSION 2.2.1

WORKDIR ${INSTALL_LOCATION}
RUN curl -L -O http://services.gradle.org/distributions/gradle-${GRADLE_VERSION}-bin.zip && \
    unzip -qo gradle-${GRADLE_VERSION}-bin.zip && \
    rm -rf gradle-${GRADLE_VERSION}-bin.zip
    
# to make the container more portable, the installation directory name is changed from the default
# gradle-${GRADLE_VERSION} to just gradle, with the version number stored in a text file for reference
# e.g. instead of /opt/gradle-2.2.1, the directory will be /opt/gradle

RUN mv gradle-${GRADLE_VERSION} gradle && \
    echo ${GRADLE_VERSION} > gradle/version
    
VOLUME ${INSTALL_LOCATION}/gradle

# echo to make it easy to grep
CMD /bin/sh -c '/bin/echo Data container for Gradle'

(From github https://github.com/davidwong/docker/blob/master/gradle/Dockerfile)

Build the image and container from the Dockerfile. Here I’ve tagged the image with the version number of the Gradle installation, and named the container gradle-2.2.1.


docker build -t yourrepo/gradle:2.2.1 .

docker run -i -t --name gradle-2.2.1 yourrepo/gradle:2.2.1

A few things to note about this installation:

  • I have changed the directory name where Gradle is installed from the default, by removing the version number in order to make it generic.
  • no environmental variables have been set, that will be done later
  • you can use any minimal image as the basis for the container, it justs need curl or wget in order to download the Gradle archive file

Now we have the Gradle installation in a docker data volume that can be persisted and shared by other containers.

You can then repeat this process with difference versions of Gradle to create separate data containers for each version (of course giving the containers different names, e.g. gradle-2.2.1, gradle-1.9, etc).

Use Case

I originally got this idea when I was running my Jenkins CI docker container. Some of the Jenkins builds required Gradle 2.x while others were using Gradle 1.x.

So instead of building multiple Jenkins + Gradle images for the different versions of Gradle required, I can now just run the Jenkins container with the appropriate Gradle data container. This is done by using –volumes-from to get access to the Gradle installation directory and setting the require environmental variables.

To use the data container with Gradle 2.2.1 installed:

docker run -i -t --volumes-from gradle-2.2.1 -e GRADLE_HOME=/opt/gradle -e PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/gradle/bin myjenkins</pre>

To use the one with Gradle 1.9:

docker run -i -t --volumes-from gradle-1.9 -e GRADLE_HOME=/opt/gradle -e PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/gradle/bin myjenkins</pre>

Of course there are limitations to this technique since Docker data volume containers were designed to share persistant data rather than application installs. In particular they do not allow sharing of environmental variables.

However this work around that can be useful in some circumstances.

Backup a Docker Data Container with Fig

I have been using data volume containers to persist data in docker containers.  There are various reasons why this tends to be a better option than just using data volumes, but probably the most important is portability.

Of course now we have to backup the data in the data containers. This can be for archiving, or when the containers that use the data need to be upgraded or recreated. If your backup requirements are simple you can simply use the docker cp command or something like tar.

A Jenkins example

As a simple example, let’s run a Jenkins server in a docker container and use a data volume container to persist its data.

1. Pull or build a Jenkins image from the official repository.

http://jenkins-ci.org/content/official-jenkins-lts-docker-image

2. The Jenkins images uses the directory /var/jenkins_home as the volume to store it’s data, so we need a data volume container for that volume. Here is a sample of a Dockerfile to build the data container:

Build and tag the image from the Dockerfile.

docker build -t your_repository:jenkins-data .

You can now create the data container, giving it a name for convenience. Optionally we can run the docker ps command afterwards to check that the container has been created, it should be in a stopped state.

docker run -i -t --name jenkins-data your_repository:jenkins-data
docker ps -a

3. Run the Jenkins server with the data container attached and make some changes, e.g. create a job, etc. The Jenkins data volume should have your changes in it now.

docker run --name=jenkins-sample -p 8080:8080 --volumes-from=jenkins-data jenkins

4. For this example we will use tar to backup the data container, using this command to create a temporary container to access the data container.

docker run -rm --volumes-from jenkins-data -v $(pwd):/backup busybox tar cvf /backup/jenkins_backup.tar /var/jenkins_home

There should now be a file jenkins_backup.tar in the current directory. Of course for real usage, we would probably run this command from a script and make it generic to be able to backup any data volume container.

I do give a fig …

Something else I use for development with Docker is the orchestration tool Fig (this has saved me a lot of typing!). So here is an example of  doing the same backup on the Jenkins data container using Fig.

1. Create Fig YAML file, using the same information that we used in the backup command.

2. Run Fig, that’s it!

fig up

This is a simple example that has only scratched the surface of what can be done with Docker (and Fig). If the backup requirements for the data is more complex, then you could also consider creating a dedicated container just for doing backups, with all the required tools installed in it.

The great thing about Docker is that once everything has been setup, you can get applications such as Jenkins up and running very quickly.