Docker images consist of multiple layers that collectively provide the content you see in your containers. But what actually is a layer, and how does it differ from a complete image?
In this article you’ll learn how to distinguish these two concepts and why the difference matters. While you can use Docker without a thorough understanding of layers, having an awareness of their purpose will help you identify optimization opportunities.
What’s an Image?
A Docker “image” behaves like a template from which consistent containers can be created. If Docker were a traditional virtual machine, the image could be likened to the ISO used to install your VM. This isn’t a robust comparison, as Docker differs from VMs in terms of both concept and implementation, but it’s a useful starting point nonetheless.
Images define the initial filesystem state of new containers. They bundle your application’s source code and its dependencies into a self-contained package that’s ready to use with a container runtime. Within the image, filesystem content is represented as multiple independent layers.
What are Layers?
Layers are a result of the way Docker images are built. Each step in a Dockerfile creates a new “layer” that’s essentially a diff of the filesystem changes since the last step. Metadata instructions such as MAINTAINER do not create layers because they don’t affect the filesystem.
This Dockerfile has two instructions that change the filesystem (COPY and RUN), so it’ll create two layers:
FROM ubuntu:latest
COPY foo.txt /foo.txt
RUN date > /built-on.txt
- The COPY instruction copies foo.txt into a new layer that’s based on the ubuntu:latest image.
- The RUN instruction runs the date command and redirects its output into a file. This creates a second layer that’s based on the previous one.
Create foo.txt in your working directory:
$ echo "Hello World" > foo.txt
Now build the sample image:
$ docker build . -t demo:latest
Sending build context to Docker daemon  2.56kB
Step 1/3 : FROM ubuntu:latest
 ---> df5de72bdb3b
Step 2/3 : COPY foo.txt /foo.txt
 ---> 4932aede6a15
Step 3/3 : RUN date > /built-on.txt
 ---> Running in 91d260fc2e68
Removing intermediate container 91d260fc2e68
 ---> 6f653c6a60fa
Successfully built 6f653c6a60fa
Successfully tagged demo:latest
Each build step emits the ID of the created layer. The last step’s layer becomes the final image so it gets tagged with demo:latest.
The sequence reveals that layers are valid Docker images. Although the term “layer” isn’t normally used to refer to a tagged image, all tagged images are technically just layers with an identifier assigned.
You can start a container from an intermediate layer’s image:
$ docker run -it 4932aede6a15 sh
# cat /foo.txt
Hello World
# cat /built-on.txt
cat: /built-on.txt: No such file or directory
This example starts a container from the layer created by the second build step. foo.txt is available in the container but built-on.txt doesn’t exist because it’s not added until the third step. That file’s only available in the filesystems of subsequent layers.
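The stacking behavior seen in that container can be modeled in a few lines of Python. This is a minimal sketch under simplifying assumptions, not a description of how Docker stores layers on disk: each layer is a plain dictionary of path changes, and the names BASE, COPY_LAYER, RUN_LAYER and filesystem() are invented for illustration.

```python
# Simplified model: each layer is a diff that maps a path to its new content.
# A value of None marks a path that the layer deleted.
BASE = {"/bin/sh": "<shell binary>"}               # stands in for ubuntu:latest
COPY_LAYER = {"/foo.txt": "Hello World\n"}         # COPY foo.txt /foo.txt
RUN_LAYER = {"/built-on.txt": "<date output>\n"}   # RUN date > /built-on.txt


def filesystem(layers):
    """Apply layer diffs in order and return the resulting file tree."""
    tree = {}
    for diff in layers:
        for path, content in diff.items():
            if content is None:
                tree.pop(path, None)   # the layer removed this path
            else:
                tree[path] = content   # the layer added or replaced this path
    return tree


after_copy = filesystem([BASE, COPY_LAYER])
after_run = filesystem([BASE, COPY_LAYER, RUN_LAYER])

print(sorted(after_copy))  # ['/bin/sh', '/foo.txt']
print(sorted(after_run))   # ['/bin/sh', '/built-on.txt', '/foo.txt']
```

The view after the COPY layer has no /built-on.txt, which mirrors what the container started from 4932aede6a15 reported.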
The Role of Layers
Layers contain the changes created by a build step, relative to the previous layer in the Dockerfile. FROM instructions are a special case: they reference the final layer of an existing image.
Layers allow build steps to be cached to avoid redundant work. Docker can skip unchanged instructions in your Dockerfile by reusing the previously created layer. It bases the next step on that existing layer, instead of building a new one.
You can see this by modifying your Dockerfile as follows:
FROM ubuntu:latest
COPY foo.txt /foo.txt
RUN date +%Y-%m-%d > /built-on.txt
The third build step has changed. Now rebuild your image:
$ docker build . -t demo:latest
Sending build context to Docker daemon  3.584kB
Step 1/3 : FROM ubuntu:latest
 ---> df5de72bdb3b
Step 2/3 : COPY foo.txt /foo.txt
 ---> Using cache
 ---> 4932aede6a15
Step 3/3 : RUN date +%Y-%m-%d > /built-on.txt
 ---> Running in 2b91ec0462c4
Removing intermediate container 2b91ec0462c4
 ---> c6647ff378c1
Successfully built c6647ff378c1
Successfully tagged demo:latest
The second build step shows as Using cache and produces the same layer ID. Docker could skip building this layer as it was already created earlier and foo.txt hasn’t changed since the first build.
This caching only works up to the point a layer is modified. All the steps after that layer will need to be rebuilt too so they’re based on the new filesystem revision.
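One way to picture why a change invalidates every later step is to give each step a key derived from its parent’s key. The sketch below is a simplified model and not Docker’s actual cache lookup; the functions cache_keys() and reused_steps() and the hashing scheme are assumptions made for illustration.

```python
import hashlib


def cache_keys(base, steps):
    """Return one key per step; each key depends on every step before it."""
    keys = []
    parent = base
    for instruction, inputs in steps:
        digest = hashlib.sha256(f"{parent}\n{instruction}\n{inputs}".encode())
        parent = digest.hexdigest()[:12]
        keys.append(parent)
    return keys


def reused_steps(old_keys, new_keys):
    """Count the leading steps whose keys match, i.e. the cache hits."""
    hits = 0
    for old, new in zip(old_keys, new_keys):
        if old != new:
            break
        hits += 1
    return hits


foo = "Hello World\n"  # contents of foo.txt, unchanged between the two builds
first = cache_keys("ubuntu:latest", [
    ("COPY foo.txt /foo.txt", foo),
    ("RUN date > /built-on.txt", ""),
])
second = cache_keys("ubuntu:latest", [
    ("COPY foo.txt /foo.txt", foo),
    ("RUN date +%Y-%m-%d > /built-on.txt", ""),
])

print(reused_steps(first, second))  # 1: only the COPY step is reused
```

Because each key folds in its parent, editing foo.txt in this model would change the COPY key and every key after it. That is the reason frequently changing steps are usually best placed late in a Dockerfile.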
Layers and Pull Operations
Another benefit of layers is how they enable partial image pulls. Once you’ve downloaded a few images to your machine, you’ll often find new pulls can skip some layers that you already have. This image contains 13 layers but only six had to be downloaded by the pull operation:
$ docker pull php:8.0-apache
8.0-apache: Pulling from library/php
7a6db449b51b: Already exists
ad2afdb99a9d: Already exists
dbc5aa907229: Already exists
82f252ab4ad1: Already exists
bf5b34fc9894: Already exists
6161651d3d95: Already exists
cf2adf296ef1: Already exists
f0d7c5221e44: Pull complete
f647198f6316: Pull complete
c37afe1da4e5: Pull complete
09c93531cbca: Pull complete
fef371007dd3: Pull complete
52043dbb1c06: Pull complete
Digest: sha256:429889e8f9eac0a806a005b0728a004303b0d49d77b09496d39158707abd6280
Status: Downloaded newer image for php:8.0-apache
docker.io/library/php:8.0-apache
The other layers were already present on the Docker host so they could be reused. This improves performance and avoids wasting network bandwidth.
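Conceptually, the pull compares the image’s layer list with what the host already stores and fetches only the difference. The following sketch illustrates that idea with the layer IDs from the pull output above; layers_to_download() is a hypothetical helper, and the real client does considerably more work (manifests, digests, verification).

```python
def layers_to_download(image_layers, local_layers):
    """Return the image's layers that are not stored locally, in order."""
    return [layer for layer in image_layers if layer not in local_layers]


# Layer IDs copied from the php:8.0-apache pull output.
php_layers = [
    "7a6db449b51b", "ad2afdb99a9d", "dbc5aa907229", "82f252ab4ad1",
    "bf5b34fc9894", "6161651d3d95", "cf2adf296ef1", "f0d7c5221e44",
    "f647198f6316", "c37afe1da4e5", "09c93531cbca", "fef371007dd3",
    "52043dbb1c06",
]
# The first seven were reported as "Already exists".
already_local = set(php_layers[:7])

missing = layers_to_download(php_layers, already_local)
print(len(php_layers), len(missing))  # 13 6
```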
Inspecting Image Layers
You can list the layers within an image by running the docker image history command with an image name or ID. Each layer displays the ID of the created image and the Dockerfile instruction that caused the change. You can see the total size of the content within the layer too.
$ docker image history 6f653c6a60fa
IMAGE          CREATED         CREATED BY                                        SIZE      COMMENT
6f653c6a60fa   4 minutes ago   /bin/sh -c date > /built-on.txt                   29B
f8420d1a96f3   4 minutes ago   /bin/sh -c #(nop) COPY file:a5630a7506b26a37...   0B
df5de72bdb3b   4 weeks ago     /bin/sh -c #(nop) CMD ["bash"]                    0B
<missing>      4 weeks ago     /bin/sh -c #(nop) ADD file:396eeb65c8d737180...   77.8MB
The last row displays <missing> in the IMAGE column because it refers to a layer within the ubuntu:latest base image. The layer’s content is present locally, but the intermediate image that recorded it is not: pulling an image only fetches the image ID of its final layer (df5de72bdb3b). There’s no need to independently pull all the intermediate images when you want to use a specific image.
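If you want the layer list in a script rather than a table, the JSON printed by docker image inspect includes a RootFS.Layers array holding the digests of the layers that carry filesystem content. The sketch below is one possible way to read it; parse_layers() and image_layers() are hypothetical helper names, and the sample JSON uses placeholder digests rather than real output.

```python
import json
import subprocess


def parse_layers(inspect_json):
    """Extract the layer digests from `docker image inspect` JSON output."""
    images = json.loads(inspect_json)  # the command prints a JSON array
    return images[0]["RootFS"]["Layers"]


def image_layers(image):
    """Run `docker image inspect` for an image and return its layer digests."""
    result = subprocess.run(
        ["docker", "image", "inspect", image],
        capture_output=True, text=True, check=True,
    )
    return parse_layers(result.stdout)


# Hypothetical, shortened sample of the command's output (placeholder digests).
sample = (
    '[{"RootFS": {"Type": "layers", '
    '"Layers": ["sha256:aaa", "sha256:bbb", "sha256:ccc"]}}]'
)
print(len(parse_layers(sample)))  # 3
```

On a host with Docker installed, image_layers("demo:latest") would return the list for the image built earlier. It is not called here so the example runs without a Docker daemon.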
Docker images and layers are generally interchangeable terms. A layer is an image and an image is formed from one or more layers. The major difference lies in tags: an image will be tagged and designed for end users, while the term “layer” normally refers to the untagged intermediate images created as part of a build operation. These aren’t visible unless you go looking for them.
There’s one more topic that relates to layers: running containers add an extra writable layer on top of their image. Layers sourced from the container’s image are read-only, so filesystem modifications made by the container target its ephemeral writable layer. The writable layer survives a stop and restart of the container but is discarded when the container is deleted.
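The relationship between the read-only image and the writable layer can be sketched with Python’s ChainMap, which sends every write to its first mapping. This is only an analogy under simplifying assumptions (real storage drivers work at the file or block level), and start_container() is an invented name.

```python
from collections import ChainMap
from types import MappingProxyType

# Read-only view standing in for the merged layers of the image.
image = MappingProxyType({"/foo.txt": "Hello World\n"})


def start_container(image_files):
    """Stack a fresh, empty writable layer on top of the image's files."""
    return ChainMap({}, image_files)


container = start_container(image)
container["/foo.txt"] = "changed\n"   # the write lands in the writable layer
container["/new.txt"] = "scratch\n"

print(container["/foo.txt"] == "changed\n")   # True: the container sees its copy
print(image["/foo.txt"] == "Hello World\n")   # True: the image is untouched

fresh = start_container(image)                # a new container, same image
print("/new.txt" in fresh)                    # False
```

Every container started from the image gets its own empty top layer, so changes made in one container never appear in another or in the image itself.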