How to fully utilise Docker-compose during development

TL;DR

Use multiple docker-compose files to define variants of your complex, multi-component production environment, so you can create similar environments optimised for efficient development or CI.

Part 1 (https://thestartupfactory.tech/journal/how-to-fully-utilise-docker-during-development)

  • Use Docker to manage, document and share your full development environment.
  • Use Docker layers to optimise the build process.
  • Use Docker’s multi-stage builds to build a production deployable component from source.

Part 2

  • Use Docker-compose to host/emulate your Production system.
  • Use a separate Docker-compose file to maintain your development/CI implementation.
  • Use Docker-compose to test your production components.

Problems / Motivation

In part 1 we built a development environment and CI tasks for a simple Node project in a single Dockerfile. This helped us solve a number of issues around documenting, sharing and maintaining development environments within a dev team, but it does not scale well to real-world multi-component projects.

Why not?

Docker containers are built like onions: each instruction in the Dockerfile adds a layer. The Dockerfile specification is therefore sequential in nature; change the order of the commands and you have a different container image. While easy to follow and repeatable, a sequential Dockerfile does not do a good job of communicating intent. For example, if we are installing two components, one will have to go first. How does the reader know whether this was an arbitrary decision that can be changed, or an explicit ordering to fulfil a dependency that cannot be changed?
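To make that concrete, consider a hypothetical fragment like the one below; nothing in it tells the reader whether swapping the two commands would break the build or not:

# Hypothetical fragment: is this ordering a dependency or an accident?
RUN apt-get update && apt-get install -y imagemagick
RUN npm install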

One of the fundamental benefits of Docker is reuse. The inner core layers can be shared between multiple images locally, and can be pushed and tagged in public repositories, allowing your team to use them without the investment and maintenance costs. If you run your entire development stack in a single container, what are the chances you will find an off-the-shelf image in a public registry with your project’s choice of database, backend language, frontend language and whatever other key dependencies you use, all wrapped up together with a bow on top? It is nearly Christmas, but I don’t think Santa and his elves have been beavering away for you.

Microservices. You have probably heard of them. To summarise, it is an architecture for software systems where functionality is separated into independently deployable and runnable components that communicate with each other, usually over an IP-based fabric with message queues, HTTP or RPC. This allows you to do things like patch a single component without full system downtime and complexity, and add multiple instances of heavily used or underperforming components without wasting resources by duplicating underused ones. Put simply, you cannot do this with a single mega-container.

Docker-compose your ‘Production’ system

Our example complex system is made of a number of components and some communications channels. Docker-compose is probably the easiest, cleanest way to configure this component integration. Just read through this compose file and see if you can understand what the overall system looks like.

version: "3.7"

services:

  proxy:
    image: nginx:1.15.5
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    ports:
      - 80:80

  web:
    build:
      context: ./web
    depends_on:
      - server
    environment:
      - LOGLEVEL=warn
      - PORT=3000

  server:
    build:
      context: ./server
    environment:
      - LOGLEVEL=warn
      - PORT=3000
    depends_on:
      - db
      - processor

  processor:
    build:
      context: ./processor
    environment:
      - LOGLEVEL=warn
      - PORT=3000

  db:
    image: mongo:3.4.6

It defines a system with 5 components:

  • Proxy – which provides a public interface, load balancing etc
  • Web – which serves the webpage resources (HTML, JavaScript, etc.) to users’ browsers
  • Server – which provides data APIs, business logic, database integration etc
  • Processor – which handles high CPU/memory activities orchestrated by the server, such as image processing
  • DB – our central persistent data store

The Proxy and DB services use off-the-shelf images, while Web, Server and Processor use our own Dockerfiles, which are organised into respectively named sub-directories alongside each component’s code.

.
├── README.md
├── docker-compose.yml
├── nginx.conf
├── processor
│   ├── Dockerfile
│   └── src
├── server
│   ├── Dockerfile
│   ├── src
│   └── tests
└── web
    ├── Dockerfile
    ├── package-lock.json
    ├── package.json
    ├── public
    └── src

In the compose file we use:

  • environment – to pass a list of environment variables to each component
  • volumes – to mount files or directories into the container to add configuration or source code
  • depends_on – to document dependencies and provide a start up order
  • ports – to publish network ports to the host and thereby to external users
  • build – to show where the container image is defined

We can now start our whole application with one command:

docker-compose up

Docker-compose will:

  • Download or build images for each container where required, i.e. where none is already available locally
  • Start the containers in the order required to match the dependencies
  • Create a virtual network connecting all the containers together. Each container is assigned a DNS entry matching its service name, allowing the containers to communicate with each other on any port without needing to know which IP address has been assigned. Notice how both the server and web components were configured to use port 3000; there is no collision because each container is a different host on the network, i.e. server:3000 and web:3000.
  • Configure firewall/port-forwarding so that only the published ports are externally accessible

In the sub-title, ‘Production’ is quoted for a reason. You do not need to use Docker in production to reap the benefits in development, and if you do use Docker in production you will probably need to run it slightly differently on your local machine, for example hosting a database locally vs using a cloud DB provider, or running single instances instead of scaling groups. The aim is to build a close approximation to improve the validity of testing activities and encourage a decoupled architecture.

We now have our whole stack running on our local machine in easy-to-manage containers, with no local dependencies save for Docker and Docker-compose themselves. This could be huge for your team. I have seen teams with specialised FE/BE/DB/Test members become fearful of change and create additional barriers and bureaucracy because of bad experiences installing or updating components from areas and technologies they are not familiar with. No more. Teams maintaining clunky, cracking monolithic applications can struggle to start making improvements because of the infrastructure and procedural overheads. No more. The ability to spin up and integrate new components on your own laptop significantly reduces the barrier to entry for refactoring monolithic applications.
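And because the whole stack is described declaratively, spinning up extra instances of a single component to mimic the scaling benefit mentioned earlier is a one-liner (assuming the service, like our processor, publishes no host ports that would collide):

# Run three instances of the processor component
docker-compose up -d --scale processor=3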

Separation of concerns: Production vs Development vs CI

Great work so far, we have a representative stack up and running on our dev machine. But in production we will be using a database-as-a-service provider, so we don’t want the database component. We also want to be able to do real-time development with live reloading locally, but run pre-compiled, optimised components in production. And finally, we want to include test runners and mock components so that all of our supporting infrastructure is consistently defined and easy to use.

In part 1 we saw how to use multi-stage builds to create production and development variants of a single component. How do we set up our compose file to allow us to use either of the component variants? We could create duplicate services with different properties in our existing compose file. Or we could maintain duplicate compose files with differing properties and services. But both of these approaches violate the good old Separation-of-concerns and Don’t-repeat-yourself principles. Fortunately the guys and gals of docker-compose have got our backs.

Docker-compose allows you to use an ordered list of compose files to override and extend your container definitions. You can specify the compose file list with arguments to docker-compose or with an environment variable.

# Using environment variable method
export COMPOSE_FILE=docker-compose.yml:docker-compose.dev.yml
docker-compose up

# Using argument variable method
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up
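
Whichever method you use, docker-compose can print the fully merged configuration, which is a handy way to check exactly what an override file changes:

# Show the effective configuration after the files are merged
docker-compose -f docker-compose.yml -f docker-compose.dev.yml config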

We are going to split our configuration over 3 compose files each focused on a specific aspect of our requirements:
– docker-compose.yml – to define all and only the production environment components
– docker-compose.dev.yml – to tweak the environment to make it efficient for development activities
– docker-compose.ci.yml – to extend the environment with components to add and run end-to-end tests

Compose for Production (version 2)

#docker-compose.yml
version: "3.7"

services:

  proxy:
    image: nginx:1.15.5
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    ports:
      - 80:80

  web:
    build:
      context: ./web
    depends_on:
      - server
    environment:
      - LOGLEVEL=warn
      - PORT=3000

  server:
    build:
      context: ./server
    environment:
      - LOGLEVEL=warn
      - PORT=3000
    depends_on:
      - processor

  processor:
    build:
      context: ./processor
    environment:
      - LOGLEVEL=warn
      - PORT=3000

This file is basically unchanged except that the database component and related depends_on entries have been removed.

Compose for Development

#docker-compose.dev.yml
version: '3.7'

services:

  web:
    build:
      target: dev
    volumes:
      - ./web/src:/home/app/src
      - ./web/public:/home/app/public

  server:
    build:
      target: dev
    environment:
      - LOGLEVEL=debug
    volumes:
      - ./server/src:/home/app/src
    depends_on:
      - db

  db:
    image: mongo:3.4.6

The first thing to note is that we have added the ‘db’ component back in and updated the server component to add the dependency, but we have not fully redefined the server component, only the additional and changed properties.
I have also lowered the log level of the server component to debug (i.e. more verbose output) by overriding the ‘LOGLEVEL’ environment variable.
I have added the ‘target’ property to the web and server components’ build configuration. In part 1 we covered multi-stage builds in Docker; this property tells Docker to use the image from the stage labelled ‘dev’. The dev stage image runs the source code directly with dynamic reloading turned on. This makes the development cycle much faster, as we don’t need to rebuild and restart containers to pick up edits during development.
It would not be much use dynamically reloading source code that is statically copied into the container, so we also add the ‘volumes’ properties to mount the current source code from the host machine over the top of the source code baked into the container image. Neither of these is something we would want to do in our production docker-compose.
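
As a reminder of what the ‘dev’ target refers to, the multi-stage Dockerfiles from part 1 look roughly like the sketch below (the stage names, base images and commands here are illustrative, not the exact files from the project):

# Dockerfile (illustrative multi-stage sketch)
FROM node:10 as dev
WORKDIR /home/app
COPY package.json package-lock.json ./
RUN npm install
COPY src ./src
# Dev stage: run the source directly with dynamic reloading
CMD ["npm", "run", "dev"]

FROM dev as build
RUN npm run build

FROM node:10-alpine as production
WORKDIR /home/app
COPY --from=build /home/app ./
# Production stage: run the pre-built, optimised output
CMD ["npm", "start"]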

We can now easily run our application in production-like mode (using the fixed, tested images)

docker-compose -f docker-compose.yml up

or in development mode (with mounted source code and dynamic reloading etc)

# Using environment variable method
export COMPOSE_FILE=docker-compose.yml:docker-compose.dev.yml
docker-compose up

or

# Using argument variable method
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up

Compose for CI

In part 1 we included running test suites in the container build. Those test suites should be restricted to testing within that component: unit tests and component tests. Now that we have a multi-component system, we need to check that all the components of the system work together, to prove both that the docker-compose wiring is correct and that all the components have consistently implemented their interfaces.

There are lots of different applications out there in the big wide world, and lots of frameworks and test systems to accompany them. That’s not really the focus of this blog, but it may feel hard to separate the test tech from the pattern. In this example I will use Selenium, probably the most widely used tech for testing web apps in browsers. But for any project the test runners, harnesses, mocks and instrumentation should be separate from the application code/component under test, so a similar pattern should work for you too.

#docker-compose.ci.yml
version: '3.7'

services:

  selenium-chrome:
    image: selenium/node-chrome-debug:3.14.0-francium
    ports:
      - 5900:5900
    depends_on:
      - selenium-hub
    environment:
      - HUB_HOST=selenium-hub
      - HUB_PORT=4444

  selenium-hub:
    image: selenium/hub:3.14.0-francium
    ports:
      - 4444:4444

  test-runner:
    build:
      context: .
    volumes:
      - ./e2e_test/src:/home/app/src
    environment:
      - SELENIUM_HUB_URL=http://selenium-hub:4444/wd/hub
      - TEST_SUBJECT_URL=http://proxy/
    depends_on:
      - selenium-chrome

This compose file describes a 3-component test system

  • selenium-chrome – an off-the-shelf web browser in a container, configured to connect to a selenium hub
  • selenium-hub – an intelligent proxy for running selenium tests; it allows us to share the test load across multiple similar nodes and to manage multi-browser testing across multiple different nodes
  • test-runner – as the name suggests, runs the tests; the 2 environment variables tell it
    • to use the hub container to run the tests
    • the web address of the application we want to test

Only the test-runner component has bespoke code, and this code is entirely independent of the application code. Given that part 1 showed how easy it is to manage dev dependencies, and the e2e code is independent, only interacting through web or other generic interfaces, you are free to choose any language and framework to write your tests in: Java-Selenium, JS-webdriverio, Groovy-Geb etc. This can be a great benefit if you have an independent test team with their own skill set, or if you are using a language with less off-the-shelf test infrastructure available.
The TEST_SUBJECT_URL also makes it trivial to run the same tests against different environments (including production, if appropriate), or to use this test setup without Dockerising the application or development environments.
Adding a second or third browser component, to share the test load or to duplicate tests in a different browser, is as simple as adding a new service to the compose file and setting the number of executors or the browsers list in the test runner config.
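For example, adding a Firefox node alongside the Chrome one might look something like this (the firefox-debug image tag is assumed to match the hub version used above):

  selenium-firefox:
    image: selenium/node-firefox-debug:3.14.0-francium
    depends_on:
      - selenium-hub
    environment:
      - HUB_HOST=selenium-hub
      - HUB_PORT=4444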
Now that you have a full production and test stack in docker, both dev and test teams can easily develop and debug issues on their local machines and have the confidence that the behaviour will be the same as in the CI environment. Gone are the days of pushing hundreds of commits to trial fixes to the CI system on a fragile server.

Run the test suite against the local production environment

docker-compose -f docker-compose.yml -f docker-compose.ci.yml run test-runner

Run the test suite against the uncommitted development code

docker-compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.ci.yml run test-runner

Run the test suite against the deployed live production environment

docker-compose -f docker-compose.yml -f docker-compose.ci.yml run -e TEST_SUBJECT_URL=https://mylivesite.com test-runner

Extra notes

Compose logs

One of the simple but sweet benefits of having your dev stack in docker-compose is that you can see the logs of each component interleaved and labeled by docker in one place. No need to set up a log aggregator or tmux into multiple terminals.
Follow all logs of all component services

docker-compose up

or

docker-compose up -d
docker-compose logs -f

Follow logs for one or more component services

docker-compose logs -f server processor

Notice that the ‘up’ command follows all logs by default. Generally I don’t find this very practical because I often want to stop following, do something else, and then come back, but ctrl-c will (sometimes) terminate the ‘up’ and stop the containers.
It is far more common to start the containers in the background with ‘up -d’ and then attach to the logs explicitly.
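
When you are done, the background containers are just as easy to tidy up:

# Stop the containers but keep them around
docker-compose stop

# Stop and remove the containers and the virtual network
docker-compose down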

Adding dependencies

If you can remember back to part 1, we built a multi-layered Docker image with third-party dependencies fixed in one layer, then our latest code mounted on top of the image and dynamically rebuilt in the dev container.
Given that you are always mid-development, you will need to update your dependencies from time to time. Stopping and rebuilding the container is one option, but it is slow because it rebuilds everything from the dependencies layer upwards. Instead you can use Docker’s exec command to run a command inside the running container and update the dependencies in place. Add the new dependency to your package.json file, then

docker-compose exec server npm install

The only gotcha is that if you delete the container and restart it without rebuilding the image, the new dependencies will be missing. So remember to rebuild your image after adding dependencies.
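
Rebuilding is a single command, so it is easy to make part of your routine once the new dependency is in package.json:

# Rebuild the server image so the dependency layer includes the new package
docker-compose build server

# Or rebuild and restart the service in one step
docker-compose up -d --build server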

Recap

In part 1 we saw how to create a Dockerfile for a single component:

  • With multiple targets configured for development and production needs
  • With docker image layers used to cache dependencies
  • That runs tests during the build to “guarantee” the image is correct

In part 2 (this article) we saw how to create a more realistic multi-component application:

  • Using docker-compose files to define and document the component integration
  • How to keep Development environment specific integration separate from that of the Production environment
  • How to share common definitions across Production and Development environments
  • How to mount source code and tell Docker to use a specific stage image, enabling an efficient development experience
  • How to build extendable test infrastructure that runs anywhere.
