How to fully utilise Docker during development

Photo by Johnson Wang on Unsplash

TL;DR

Part 1

  • Use Docker to manage, document and share your full development environment
  • Use Docker to run your CI tests
  • Use Docker layers to optimise the build process
  • Use Docker’s multi-stage builds to build a production deployable component from source

Part 2 (Coming soon)

  • Use Docker-compose to host/emulate your Production system.
  • Use a separate Docker-compose file to maintain your development/CI implementation.
  • Use Docker-compose to test your production components

If you don’t know Node…

This blog is about using Docker and Docker-compose. All the features can be applied generically to many development tech stacks; I have used the same pattern with JavaScript, Python, Java and Scala projects.

In the examples I use a simple Node application. If you are not familiar with Node, this is all you need to know:

  • Node is an execution environment and comes with a package manager called npm
  • The package.json file defines third-party dependencies and build scripts
  • Third-party dependencies are downloaded to a sub-directory of the project called node_modules (contrast to Maven which uses a configurable directory defaulting to a user directory common to many projects)
  • Typically these commands are used in the development workflow
    • npm install – download third-party dependencies to node_modules directory
    • npm start – execute program from source code
    • npm test – execute unit test suite
    • npm run build – optimise and package for production (runs the build script defined in package.json)

Problems / Motivation

Historically developers have had to download and install a whole suite of languages, compilers, databases and analysers to their local machine. They also installed a similar but often different toolset on one or more CI servers. Over time, as the team gradually changes and the project evolves, and due to individuals being individuals, the CI servers and each developer machine all end up slightly different. Then the build breaks or a bug is raised to the sound of:

Well, it worked on my machine

If you were a diligent team, you had a wiki page detailing how new starters could get up and running. But when the newbie did arrive they spent days (or weeks – I’m not exaggerating) to get up and running. Fortunately we had a project manager who knew a solution, “put it in the wiki, then the next newb will be okay”.

Then, when working on a new feature someone changes a dependency. Their build works so they push it. If you set up your CI system well you catch the issue but only after the developer has marked the task as done and left for the weekend. But maybe the CI build passes. It is then the newbie who does not have an old dependency lying around in a persistent directory who finds their build does not work, and they spend the first few hours inspecting their own code and setup which is only natural given the code works for the senior developer and the CI system.

Back in the dark ages a good developer would run the unit tests locally before pushing. It wasn’t too bad that it took 5 minutes because they could go and make a brew. The CI system was left to run the long test suite. When something went wrong the developer could not always reproduce it locally because the CI suite was set up differently to their environment, or because it was a big project and, being a back-end developer, they only had half the build stack. The only way to fix the problem was to look at the CI logs, push a change that might help, wait and repeat.

Dockerise your development environment

The first and easiest step is to create a Dockerfile for your development environment. The Dockerfile simply documents the install instructions with precise version numbers and steps. Developer setup is now as simple as installing Docker, then running docker build -t my-dev-env . followed by docker run my-dev-env. The Dockerfile is added to source control so that the development environment is always in sync with the code that requires it.

To create a very simple JavaScript development environment, save the following into ./Dockerfile
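A minimal sketch of such a Dockerfile is below. The Node version, the /app working directory and the my-dev-env image name are my assumptions for illustration; pick whatever your project actually needs.

```dockerfile
# Pin the base image to an exact version so that upgrades are deliberate.
# (20.11.0 is an illustrative choice, not a recommendation.)
FROM node:20.11.0

# Directory inside the container where the project is expected to be mounted.
WORKDIR /app

# With no arguments, drop into a shell; pass a command to run something else.
CMD ["bash"]

# Usage (my-dev-env is a hypothetical image name):
#   docker build -t my-dev-env .
#   docker run --rm -it -v "$(pwd)":/app my-dev-env npm install
#   docker run --rm -it -v "$(pwd)":/app my-dev-env npm test
```

The -v flag mounts the current directory at /app, so npm inside the container operates on the code from the host.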

Note that you should always use a fully qualified version for base images and dependencies. You want changes to be a conscious decision.

You can now simply mount your code and run, test and package it etc.

This is not the greatest example: we have only saved a local install of Node. But we have at least got a consistent and documented version across our dev team. In the real world ALL projects are more complex than this, and with every additional dependency the cost is repaid.

Dockerise your build steps

The above example still suffers from a few problems:

  • Re-mounting the workspace pollutes the host machine and allows build artefacts to bleed from one build or project to the next.
  • The build workflow is still an external process documented in a wiki or in our memory, but now each command is much nastier, with all the extra Docker options it needs.

Let’s make a few updates to the Dockerfile
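The updated Dockerfile could look something like the sketch below. It assumes that source and tests live under src/, that the build script writes its output to dist/ and that dist/index.js is the entry point; these names, the Node version and the my-app tag are illustrative assumptions, so adjust them to your project.

```dockerfile
FROM node:20.11.0
WORKDIR /app

# Copy only the dependency manifest first, so this layer and the install
# below are rebuilt only when package.json changes.
COPY package.json ./
RUN npm install

# Source changes invalidate the cache from this point down, leaving the
# dependency layer above untouched.
COPY src ./src

# A failing test fails the build, so no image is produced.
RUN npm test

# Package for production (assumes a "build" script in package.json).
RUN npm run build

# Default command: run the packaged application (hypothetical entry point).
CMD ["node", "dist/index.js"]

# Usage (my-app is a hypothetical image name):
#   docker build -t my-app .
#   docker run --rm my-app
# Development, overriding the default command and mounting only the source:
#   docker run --rm -it -v "$(pwd)/src":/app/src my-app npm start
```

Mounting only src/ in the development command leaves the node_modules directory inside the container in place.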

The first new step is to copy in the package.json file which defines the dependencies and install them into the container image with npm install . Notice that:

  • Only the package.json file is copied in, not the parent or node_modules directories. The dependencies are therefore saved into a node_modules directory that only exists within the container. They do not pollute the source directory on the host machine and cannot spill into subsequent builds or other projects.
  • The download happens once during the build phase, so that running the container is quick because the dependencies are already present.

After the third-party dependencies are installed, copy in the source code and other artefacts. It is important to separate these steps to achieve a quick build time. Docker hashes the input of each step of the build and reuses the cached layer if nothing has changed; when there is a change it rebuilds that layer and every layer after it. Remember that the third-party dependencies are downloaded into the container image in one step, so if one dependency changes the whole layer is invalidated and all dependencies are re-downloaded. Most builds will only have a source code change, in which case Docker will see that the package.json file is unchanged and reuse the layer containing the pre-downloaded dependencies, making the build substantially faster.

The next step is to run the tests during the build phase. This has 2 benefits:

  • It guarantees that any container image produced must have passed the tests
  • It defines and documents the test execution steps in one place, allowing them to be run reliably on both development and CI systems with one simple command.

Finally we package for production and define how the container will execute by default.
We can now easily run the ‘production’ version by building the image and running it with no further arguments.

And whilst we are still actively hacking the code we run the development version, mounting the source from the host and overriding the default command. This watches for code changes on the host, quickly and dynamically re-compiling, running and testing as defined in the project’s package.json file.

Great. That covers pretty much all the problems we covered in the motivation section. But the “production” container also contains a load of development dependencies that are not required in a production image. That usually rubs people up the wrong way, especially those security-conscious people who keep us feeling warm and cosy at night.

Multistage builds

Multistage builds are a Docker feature that neatly allows us to build one container to encapsulate the build and then copy artefacts into a new container optimised for production.

Label the first FROM  line with AS dev  and add a new section with a second  FROM
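A sketch of the resulting two-stage Dockerfile follows. As before, the src/ and dist/ paths, the dist/index.js entry point, the image versions and the my-app-dev tag are assumptions for illustration. It also assumes the build output is self-contained and needs no node_modules at run time, and that the start script in package.json runs in watch mode.

```dockerfile
# Stage 1: the full toolchain. Installs, tests and builds.
FROM node:20.11.0 AS dev
WORKDIR /app
COPY package.json ./
RUN npm install
COPY src ./src
RUN npm test
RUN npm run build
# Default the dev image to watch mode for day-to-day development.
CMD ["npm", "start"]

# Stage 2: a slimmer runtime image holding only the build output.
FROM node:20.11.0-slim
WORKDIR /app
# --from=dev copies from the first stage's filesystem, not from the host.
COPY --from=dev /app/dist ./dist
CMD ["node", "dist/index.js"]

# Usage (my-app-dev is a hypothetical image name):
#   Production mode, one command:
#     docker run --rm $(docker build -q .)
#   Development mode, two commands:
#     docker build --target dev -t my-app-dev .
#     docker run --rm -it -v "$(pwd)/src":/app/src my-app-dev
```

Building without --target produces the final stage, which still depends on the dev stage, so the tests run either way.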

I do not claim this is a good example of a production environment but it serves the illustrative purpose. We could pick any base image for the subsequent stage; a good choice here could have been nginx. Notice that the COPY command uses the flag --from=dev to direct it to copy from the named first stage instead of the host.

Also change the command for the dev container to default to the watch mode. This simplifies the most common execution commands.

Summary

We have written one Dockerfile defining and documenting:

  • a full development environment and steps to build with one command
  • which can be run in production mode with one command
  • which can be built and run in development mode with two commands
  • which can be reliably executed by any new team member or CI system in minutes
  • which keeps all artefacts insulated within the container, protecting future builds and other projects
  • which guarantees the tests have been run on all production images

Part 2

So far we have only looked at Docker and used a very simple example node.js build process.

In part 2 (coming soon) I will describe how you can use docker-compose  to extend this twee example to cover complex multi-component projects.
