Steps to Build and Debug a Dockerfile

This past week I had to containerise an application. I ended up going the containerisation route because:

  • The apt-get package for this application did not work properly
  • The application required specific versions of core system software
    • I did not want to make the changes to my system in case I broke something in the process
  • The setup process for this application's dependencies was fairly involved and as a result cumbersome to replicate on my colleagues' machines

My first step was to see if I'd get lucky and find an existing dockerfile for this application online. After a quick search I found one. Unfortunately it used an old non-LTS version of Ubuntu and no longer worked when I ran a docker build against it. As a result I ended up having to recreate basically the entire file.

My process to do this was:

  1. Work out the dependencies I need for my application
  2. Create a separate stage in a Docker multi-stage build for each dependency
  3. Chain these stages together, i.e. each stage refers to the previous one
    • Breaking the process up into multiple stages makes the file easier to read, maintain and debug, as you can see exactly which stage failed
  4. Combine it all together in the last stage
    • Because the stages are chained together, the last stage has access to everything the previous stages produced (see the sketch below)
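As a minimal sketch of this layout (the stage names and build commands here are placeholders, not the real application's):

FROM ubuntu:22.04 AS firstDependency
# build the first dependency in its own stage
RUN commandsToBuildFirstDependency

FROM firstDependency AS secondDependency
# this stage starts from firstDependency's filesystem,
# so everything built there is available here
RUN commandsToBuildSecondDependency

FROM secondDependency AS final
# the last stage combines everything and starts the application
COPY . /app
CMD ["commandToStartApp"]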

Gotchas and tips

Gotcha: Each RUN in a dockerfile runs in a separate container

In your dockerfile when you use the RUN command to run a Linux command in the container, each call to RUN is executed in a fresh container layered on top of the previous image state.

This means that the shell state of the previous RUN, such as its working directory or any environment variables it exported, will not be carried through to the next RUN (filesystem changes do persist, as they become part of the image layer).

For example the below is the incorrect way of doing this:

...
RUN firstCommandsYouRun && cd some/path
# the below run is not executing in some/path as it is in a separate RUN and thus a separate container
RUN echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking

To fix this we update as follows:

...
# note the added && \
RUN firstCommandsYouRun && cd some/path && \
    echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking

All of the commands in this RUN now execute in the same layer and share the same shell context. The main caveat is that when this RUN step breaks it is harder to debug, as you get a single error for the entire chained command; I describe how to debug this in the following sections. You should also break your RUNs out into logical parts rather than one monolithic RUN: commands within each RUN are still chained together, but the RUNs themselves remain logically separated, as in the sketch below.
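A rough sketch of what I mean by logical parts (the package names, repository URL and build commands are placeholders):

# one RUN per logical step: install packages, fetch sources, build
RUN apt-get update && apt-get install -y somePackages
RUN git clone https://some.repo/app.git /app
RUN cd /app && \
    ./configure && \
    make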

Debugging container build issues

Tip: Take advantage of layer caching to speed up debugging

In my earlier gotcha I mentioned how multiple commands should be chained together to reduce the number of layers your image needs and to share state between related commands.

The exception to this is when you are debugging. If the step you are working on starts with something slow, say a git clone, a download or installing a number of packages, place that part in a separate RUN; otherwise you wait for the download on every build, which slows down the debugging loop. Once split, the slow RUN is executed once and its layer is cached, while the failing part lives in a new RUN below it. When you run docker build again, the time-consuming RUN is not re-executed; only the new RUN containing the failing part runs again!

Below is an example of what this looks like before this optimization:

...
RUN wget http://some.huge.binary && \
    echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking

To optimize this for debugging purposes I updated the above as follows:

...
RUN wget http://some.huge.binary
# Notice that we split this out
# You may need to cd to the correct location as the other run is in a separate container execution
RUN echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking

Now you can keep fiddling with the split-out RUN until you fix the issue. Basically keep changing the split-out part and running docker build -t yourImageName:versionNum . until the step succeeds, then recombine the split-out RUNs back into one RUN using && \ to chain the commands together, as sketched below.
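For example, the recombined RUN might end up looking something like this (theNowFixedCommands stands in for whatever you ended up fixing):

...
# recombined into a single RUN once everything works
RUN wget http://some.huge.binary && \
    echo "Starting the complicated build steps now" && \
    theNowFixedCommands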

Tip: Comment out the failing dockerfile section until the end of the file

Doing this and then running the container lets you run the partial image that works, and then run the breaking commands within the container by hand to better debug why certain steps are failing.

If there are many steps that are breaking, put these in a file in your project's root called, for example, scripts.sh and copy that into the container. You can now easily run these commands in the running container and debug them.
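A hypothetical scripts.sh would simply collect the failing commands:

#!/bin/bash
# the breaking part of the build, gathered so it can be rerun by hand
set -e
commandsThatAreBreaking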

For example I have the following:

FROM ubuntu:18.04
...

# install a few tools on the container to make debugging on it easier
RUN apt-get update && apt-get install -y vim wget screen
...
# copy a scripts.sh which contains the breaking part of the build
COPY scripts.sh /root/scripts.sh

...
RUN commandsThatAreFine
RUN commandsThatAreBreaking
...
EXPOSE 6701
CMD ["commandToStartApp"]

Simply comment out from the breaking part downwards:

FROM ubuntu:18.04
...

# install a few tools on the container to make debugging on it easier
RUN apt-get update && apt-get install -y vim wget screen
...
# copy a scripts.sh which contains the breaking part of the build
COPY scripts.sh /root/scripts.sh

...
RUN commandsThatAreFine
#RUN commandsThatAreBreaking
#...
#EXPOSE 6701
#CMD ["commandToStartApp"]

Now build the container, run it and shell in:

docker build -t yourImageName:versionNum .
# -it keeps a shell alive so the container stays running, since our CMD is commented out
docker run -dit --name yourContainerName yourImageName:versionNum
docker exec -it yourContainerName bash

Assuming all the above commands completed successfully you should now be inside the running container. You can now run whatever commands failed and further debug what is going on.
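For example, since we copied scripts.sh into the image earlier, you can replay the breaking part directly:

# inside the container, rerun the breaking commands from the copied script
bash /root/scripts.sh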

Once you work out the issue, uncomment the broken sections, fix them based on your findings and try rebuilding the image. Keep repeating this process until you have a fully working image.

Tip: Add new RUN sections as far down in your dockerfile as possible

In Docker, if you have multiple RUNs one after another and you change one in the middle, every instruction after the changed RUN has to be re-executed and cannot use the cached layers from your previous docker build. For example, say we have:

...
RUN new-commands-you-just-added
# the below runs were executed in a previous docker build but will be rerun
RUN wget http://some.huge.binary
RUN echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking

In the above, we added our new commands before commands we had already run. This makes sense only if the new command is needed by the existing commands. If the new command can run after the existing RUN commands, move it below them so that you can take advantage of the cache from previous docker builds. To fix the above we would change it as follows:

...
RUN wget http://some.huge.binary
RUN echo "Starting the complicated build steps now" && \
... The rest of the chained command where the docker build is breaking
RUN new-commands-you-just-added
# Your CMD would still have to go after this RUN
# ...