Sharing cached layers between docker and docker-compose builds

Photo by Sebastian Herrmann on Unsplash

This post is the second part of our exploration of Docker caching. If you haven’t already, check out the first part of the story, where we introduced docker layers and the caching mechanism. Let’s now have a look at docker-compose and some challenges when using both docker and docker-compose at the same time.

Docker Compose is a great dev tool

We used the docker build and docker run commands to build and run our application from a docker image locally. If we had a project with multiple components (e.g. client and server), building and running each one can quickly become cumbersome and inefficient. Luckily, there is Docker Compose, a CLI tool designed for running multi-container applications.

Docker compose is completely separate from docker, but uses the docker engine internally to orchestrate building and running multiple containers. For an overview of Compose and installation instructions, head to the official website: https://docs.docker.com/compose/.

While each component will still have its own Dockerfile, describing how the image should be built, we now also have a docker-compose.yml file: a YAML descriptor of all containers that need to be run together and their runtime properties.

Note: for more in-depth details of using docker compose during development, check the second part of Eric’s excellent blog on that subject: https://medium.com/tsftech/how-to-fully-utilise-docker-compose-during-development-4b723caed798

Let’s have a look at what a docker-compose.yml would look like for our example (which is fairly simple as we only have one component):

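version: "3.7"
services:
  hello-world-react-docker:
    build:
      # Directory that contains the Dockerfile
      context: ./
    # Service-level option (not a build option): keeps stdin open
    stdin_open: true
    environment:
      - LOGLEVEL=debug
    ports:
      # host:container
      - 3000:3000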

We simply define a service named hello-world-react-docker, set the build context to the directory which contains the Dockerfile and expose the relevant ports, as we did when using docker run to start the container.

NOTE: stdin_open: true is a workaround for an open bug with react-scripts 3.4.1 at the time of writing (https://github.com/facebook/create-react-app/issues/8688) where react’s npm start command exits with status code 0 as soon as development server is up.

Now, running docker-compose up --build will both build the relevant image(s) and start them, so going to the browser, you’ll see the familiar React homepage again. Hurray, we’re running a React app from the docker image using docker compose!

Problem: cache sharing between docker and docker-compose builds

So we have a sample app, and we can package it as a docker image and run it anywhere. We can use standard docker commands (build and run) or use the more developer-friendly docker-compose to build and test our application locally.

However, you may have noticed a small problem: although we had built a docker image using the docker build command first (and cached all relevant layers), building with docker-compose rebuilt the entire image (so we had to wait a few minutes for npm install to finish). On subsequent builds with docker compose, caching behaved as expected, with a quick build cycle.

Look at the output below: although the image has been built with docker previously, building with docker compose does not use the cached layers and rebuilds everything (as we can see, the slow npm install command has been running).

So docker and docker-compose individually behave as expected, reusing the cached layers from the previous build. But when using both tools and switching between them, the caching does not seem to work.
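To observe the behaviour described above, the two builds can be run back to back from the project root. This is only a sketch: the image tag and the function name `compare_builds` are assumptions made for illustration, and the commands are wrapped in a function so nothing runs until it is called.

```shell
# Hypothetical image tag used for illustration; substitute your own.
IMAGE_TAG="hello-world-react-docker"

# Build once with the docker CLI, then once with docker-compose.
# With the legacy builder, the second step is where the full rebuild
# (including npm install) would be expected to show up.
compare_builds() {
  docker build -t "$IMAGE_TAG" . &&
  docker-compose build
}
```

Calling `compare_builds` in the directory that holds the Dockerfile and docker-compose.yml runs both builds in sequence, so the two outputs can be compared directly.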

Why is this a problem? Well, let’s imagine you have a project that contains a number of containerised components, all taking some time to build from scratch. You are working on a single component, but other developers in your team constantly push changes to the others. Every commit is built on the CI server, so you can download

Why is that?

Well, the reason is buried deep in the docker-compose codebase. Docker compose uses the docker Python client library to interact with the docker engine, while docker commands do that natively. The different implementations result in different image ids for images built with either tool, which has the effect on cached layers described above.

The issue is reported and discussed in a few tickets in the docker and docker-compose projects:

https://github.com/docker/compose/issues/5873
https://github.com/docker/compose/issues/883

Solution: Enter BuildKit

While there is no agreement on how to fix this for the current docker engine version, there is a simple solution which uses the still experimental (but soon mainstream) BuildKit docker engine (https://docs.docker.com/develop/develop-images/build_enhancements/). BuildKit brings a long-awaited new architecture and refactoring of the docker engine, which should result in many improvements in performance, storage management and security, including cache consistency between docker and docker-compose image builds.

We expect BuildKit to become the default in the next versions of docker, but for now, while it’s still experimental, it can be enabled by simply setting the DOCKER_BUILDKIT=1 environment variable.
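As a minimal sketch of what that looks like on the command line, the variable can be set for a single docker build invocation. The image tag and the function name `buildkit_build` are assumptions for illustration; only DOCKER_BUILDKIT=1 comes from the text above.

```shell
# Hypothetical image tag used for illustration; substitute your own.
IMAGE_TAG="hello-world-react-docker"

# Build the image with BuildKit enabled for this one invocation only.
# Prefixing the command sets the variable without changing the shell session.
buildkit_build() {
  DOCKER_BUILDKIT=1 docker build -t "$IMAGE_TAG" .
}
```

Setting the variable inline keeps the change scoped to one command; running `export DOCKER_BUILDKIT=1` instead would enable it for every later docker build in the same shell.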

Note: At the time of writing, BuildKit is only available for Linux containers.

Let’s rebuild our app using the docker build command:

You’ll see a much simpler, inline output of the build process, characteristic of BuildKit, along with timings for each layer. As before, the npm install command takes most of the time. You’ll notice that the entire image is built from scratch: BuildKit has a different layer storage strategy, so layers built with the legacy engine cannot be reused as cache.

Repeating the command will result in a much faster build, as all layers will be cached and reused without rebuilding. You can see CACHED in front of each build step that used layer caching in the snippet below:

What about docker compose? Previously, we’ve seen that docker compose didn’t reuse cached layers built using the docker build command. Let’s check how this has changed.

To enable BuildKit for docker compose, it needs an additional environment variable, COMPOSE_DOCKER_CLI_BUILD=1. Let’s rebuild the app using docker-compose:
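A sketch of the corresponding docker-compose invocation, with both variables set for a single command. The function name `compose_buildkit_build` is an assumption for illustration; the two variable names are the ones given in the text.

```shell
# Enable BuildKit and make docker-compose delegate the build to the docker CLI.
# Both variables are set for this one invocation only.
compose_buildkit_build() {
  DOCKER_BUILDKIT=1 COMPOSE_DOCKER_CLI_BUILD=1 docker-compose build
}
```

Calling `compose_buildkit_build` from the directory that contains docker-compose.yml should reuse layers created by an earlier BuildKit-enabled docker build, provided the Dockerfile and build context are unchanged.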

Whoo-hoo! The build was lightning fast, as all layers have been reused, even though this was the first docker-compose build of the image. The layers built using the docker build command have been reused from cache, as seen by the CACHED tag in the output above.

In this blog post, I tried to shed some light on docker caching and the gotchas of using the docker and docker-compose tools together during development. Remember, whatever engine or tool you use, the key is to structure your Dockerfiles so they are layer-caching-aware: place the most frequently changed files and commands toward the bottom of the Dockerfile, using multiple COPY, ADD and RUN commands depending on the lifecycle of the referenced components!
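The ordering advice above can be sketched as follows. This is not the Dockerfile from the first part of the series: the base image, the /app working directory and the function name `write_dockerfile_sketch` are all assumptions for illustration. The Dockerfile is written through a heredoc so the snippet stays self-contained.

```shell
# Write an illustrative, layer-caching-aware Dockerfile to the given path.
write_dockerfile_sketch() {
  cat > "$1" <<'EOF'
# Assumed base image; pick the Node version your project needs.
FROM node:lts-alpine
WORKDIR /app
# Dependency manifests change rarely: copy them first so the
# npm install layer stays cached across source edits.
COPY package*.json ./
RUN npm install
# Application source changes most often: copy it last.
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
EOF
}
```

For example, `write_dockerfile_sketch Dockerfile.sketch` writes the file for inspection. Because the source files are copied after npm install, editing application code only invalidates the final COPY layer and the steps that follow it.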

Aleksa is the CTO at TheStartupFactory.tech, where he helps to build the foundations of tech startups and grow their young and ambitious teams.
