Understanding Docker images
- Objectives
- What is an image?
- Example for a Java webapp
- The read-write layer
- Multiple containers sharing the same image
- Differences between containers and images
- Comparison with object-oriented programming
- Wait a minute...
- A chicken-and-egg problem
- Creating the first images
- Creating other images
- Image namespaces
- Root namespace
- User namespace
- Self-Hosted namespace
- How do you store and manage images?
- Showing current images
- Searching for images
- Downloading images
- Pulling an image
- Image and tags
- When to (not) use tags
- Section summary
Objectives
In this section, we will explain:
- What an image is.
- What a layer is.
- The various image namespaces.
- How to search for and download images.
- Image tags and when to use them.
What is an image?
Image = files + metadata
These files form the root filesystem of our container.
The metadata can indicate a number of things, e.g.:
- the author of the image
- the command to execute in the container when starting it
- environment variables to be set
- etc.
Images are made of layers, conceptually stacked on top of each other.
Each layer can add, change, and remove files and/or metadata.
Images can share layers to optimize disk usage, transfer times, and memory use.
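As a rough illustration of stacking, the sketch below models each layer as a plain directory and flattens the layers, in order, into one root filesystem. This is a toy model, not how Docker stores layers: `apply_layers` is a hypothetical helper, and the `.wh.NAME` marker (handled at the top level only) is a simplified stand-in for the whiteout files that layered filesystems use to record deletions.

```shell
# Toy model of layer stacking: each layer is a plain directory.
# apply_layers is a hypothetical helper, not a Docker command.
# Usage: apply_layers ROOTFS LAYER1 LAYER2 ...
apply_layers() {
  rootfs=$1
  shift
  mkdir -p "$rootfs"
  for layer in "$@"; do
    # Files in a later layer add to, or overwrite, files from earlier layers.
    cp -R "$layer"/. "$rootfs"/
    # A top-level file named .wh.NAME marks NAME as removed by this layer.
    for marker in "$rootfs"/.wh.*; do
      [ -e "$marker" ] || continue
      name=${marker##*/.wh.}
      rm -rf "$rootfs/$name" "$marker"
    done
  done
}
```

Calling `apply_layers rootfs base app` with two hypothetical directories `base` and `app` would leave in `rootfs` the files of `base`, as added to, changed, or removed by `app`.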
Example for a Java webapp
Each of the following items will correspond to one layer:
- CentOS base layer
- Packages and configuration files added by our local IT
- JRE
- Tomcat
- Our application's dependencies
- Our application code and assets
- Our application configuration
The read-write layer
Multiple containers sharing the same image
Differences between containers and images
An image is a read-only filesystem.
A container is an encapsulated set of processes running in a read-write copy of that filesystem.
To optimize container boot time, copy-on-write is used instead of regular copy.
docker run starts a container from a given image.
Comparison with object-oriented programming
Images are conceptually similar to classes.
Layers are conceptually similar to inheritance.
Containers are conceptually similar to instances.
Wait a minute...
If an image is read-only, how do we change it?
We don't.
We create a new container from that image.
Then we make changes to that container.
When we are satisfied with those changes, we transform them into a new layer.
A new image is created by stacking the new layer on top of the old image.
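The steps above can be sketched as a small shell helper. This is only an illustration of the workflow, assuming a running Docker Engine: `commit_changes` and the container name `tmp-build` are hypothetical, while `docker run`, `docker commit`, and `docker rm` are standard CLI commands.

```shell
# commit_changes is a hypothetical helper; it needs a running Docker Engine.
# Usage: commit_changes BASE_IMAGE NEW_IMAGE COMMAND [ARGS...]
commit_changes() {
  base=$1   # image to start from
  new=$2    # name for the resulting image
  shift 2   # the remaining arguments are the command that makes the changes
  # 1. run:    start a container from the read-only image; the command
  #            modifies the container's own read-write layer.
  # 2. commit: save those changes as a new layer on top of the base image.
  # 3. rm:     remove the temporary container (tmp-build is a made-up name).
  docker run --name tmp-build "$base" "$@" &&
    docker commit tmp-build "$new" &&
    docker rm tmp-build >/dev/null
}
```

For example, `commit_changes ubuntu my-image touch /hello` should produce an image named `my-image` (a made-up name) whose top layer adds the file `/hello` to the ubuntu image.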
A chicken-and-egg problem
The only way to create an image is by "freezing" a container.
The only way to create a container is by instantiating an image.
Help!
Creating the first images
There is a special empty image called scratch.
- It allows us to build from scratch.
The docker import command loads a tarball into Docker.
- The imported tarball becomes a standalone image.
- That new image has a single layer.
Note: you will probably never have to do this yourself.
Creating other images
docker commit
- Saves all the changes made to a container into a new layer.
- Creates a new image (effectively a copy of the container).
docker build (used 99% of the time)
- Performs a repeatable build sequence.
- This is the preferred method!
We will explain both methods in a moment.
Image namespaces
There are three namespaces:
- Official images, e.g. ubuntu, busybox...
- User (and organization) images, e.g. jpetazzo/clock
- Self-hosted images, e.g. registry.example.com:5000/my-private/image
Let's explain each of them.
Root namespace
The root namespace is for official images. They are put there by Docker Inc., but they are generally authored and maintained by third parties.
Those images include:
- Small, "swiss-army-knife" images like busybox.
- Distro images to be used as bases for your builds, like ubuntu, fedora...
- Ready-to-use components and services, like redis, postgres...
Over 130 at this point!
User namespace
The user namespace holds images for Docker Hub users and organizations.
For example:
jpetazzo/clock
- The Docker Hub user is: jpetazzo
- The image name is: clock
Self-Hosted namespace
This namespace holds images that are hosted not on Docker Hub, but on third-party registries.
Their names contain the hostname (or IP address), and optionally the port, of the registry server.
For example:
localhost:5000/wordpress
- localhost:5000 is the host and port of the registry
- wordpress is the name of the image
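The three namespaces can be told apart from the image name alone. The sketch below is a plain-shell approximation of that rule: `image_namespace` is a hypothetical helper (not a Docker command), and it assumes that the part before the first slash names a registry when it contains a dot or a colon, or is `localhost`.

```shell
# image_namespace is a hypothetical helper, not part of the Docker CLI.
# It prints "root", "user", or "self-hosted" for a given image reference.
image_namespace() {
  ref=$1
  case "$ref" in
    */*) first=${ref%%/*} ;;      # text before the first slash
    *)   echo "root"; return ;;   # no slash at all: official image
  esac
  case "$first" in
    # Assumption: a dot, a colon (port), or "localhost" means a registry host.
    localhost|*.*|*:*) echo "self-hosted" ;;
    *)                 echo "user" ;;
  esac
}

image_namespace ubuntu                     # prints: root
image_namespace jpetazzo/clock             # prints: user
image_namespace localhost:5000/wordpress   # prints: self-hosted
```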
How do you store and manage images?
Images can be stored:
- On your Docker host.
- In a Docker registry.
You can use the Docker client to download (pull) or upload (push) images.
To be more accurate: you can use the Docker client to tell a Docker Engine to push and pull images to and from a registry.
Showing current images
Let's look at what images are on our host now.
$ docker images
REPOSITORY       TAG      IMAGE ID       CREATED         SIZE
fedora           latest   ddd5c9c1d0f2   3 days ago      204.7 MB
centos           latest   d0e7f81ca65c   3 days ago      196.6 MB
ubuntu           latest   07c86167cdc4   4 days ago      188 MB
redis            latest   4f5f397d4b7c   5 days ago      177.6 MB
postgres         latest   afe2b5e1859b   5 days ago      264.5 MB
alpine           latest   70c557e50ed6   5 days ago      4.798 MB
debian           latest   f50f9524513f   6 days ago      125.1 MB
busybox          latest   3240943c9ea3   2 weeks ago     1.114 MB
training/namer   latest   902673acc741   9 months ago    289.3 MB
jpetazzo/clock   latest   12068b93616f   12 months ago   2.433 MB
Searching for images
We cannot list all images on a remote registry, but we can search for a specific keyword:
$ docker search marathon
NAME                      DESCRIPTION                     STARS   OFFICIAL   AUTOMATED
mesosphere/marathon       A cluster-wide init and co...   105                [OK]
mesoscloud/marathon       Marathon                        31                 [OK]
mesosphere/marathon-lb    Script to update haproxy b...   22                 [OK]
tobilg/mongodb-marathon   A Docker image to start a ...   4                  [OK]
"Stars" indicate the popularity of the image.
"Official" images are those in the root namespace.
"Automated" images are built automatically by the Docker Hub.
(This means that their build recipe is always available.)
Downloading images
There are two ways to download images.
- Explicitly, with docker pull.
- Implicitly, when executing docker run and the image is not found locally.
Pulling an image
$ docker pull debian:jessie
Pulling repository debian
b164861940b8: Download complete
b164861940b8: Pulling image (jessie) from debian
d1881793a057: Download complete
As seen previously, images are made up of layers.
Docker has downloaded all the necessary layers.
In this example, :jessie indicates which exact version of Debian we would like. It is a version tag.
Image and tags
Images can have tags.
Tags define image versions or variants.
docker pull ubuntu will refer to ubuntu:latest.
The :latest tag is generally updated often.
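The defaulting rule can be sketched in plain shell. `image_tag` is a hypothetical helper, not a Docker command; it assumes references of the forms shown in this section and ignores digest references.

```shell
# image_tag is a hypothetical helper, not part of the Docker CLI.
# It prints the tag of an image reference, or "latest" when none is given.
image_tag() {
  last=${1##*/}   # drop any registry host and user part, keep "name[:tag]"
  case "$last" in
    *:*) echo "${last##*:}" ;;   # explicit tag after the colon
    *)   echo "latest" ;;        # no tag: latest is implied
  esac
}

image_tag ubuntu                     # prints: latest
image_tag debian:jessie              # prints: jessie
image_tag localhost:5000/wordpress   # prints: latest (the colon belongs to the port)
```

Looking only at the last path component keeps a registry port from being mistaken for a tag.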
When to (not) use tags
Don't specify tags:
- When doing rapid testing and prototyping.
- When experimenting.
- When you want the latest version.
Do specify tags:
- When recording a procedure into a script.
- When going to production.
- To ensure that the same version will be used everywhere.
- To ensure repeatability later.
Section summary
We've learned how to:
- Understand images and layers.
- Understand Docker image namespacing.
- Search and download images.