Extra tips

Controlling Docker from a container

  • In a local environment, just bind-mount the Docker control socket:

    docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker
    
  • Otherwise, you have to:

    • set DOCKER_HOST,
    • set DOCKER_TLS_VERIFY and DOCKER_CERT_PATH (if you use TLS),
    • copy certificates to the container that will need API access.
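
    The remote case could look like the minimal sketch below. The helper name run_remote_client and the /certs path inside the container are assumptions for illustration, and the certificates are bind-mounted read-only rather than copied, which is one possible way to get them into the container.

```shell
# Hypothetical helper: run the Docker CLI in a container against a remote,
# TLS-protected Docker API.
# $1 = API endpoint, e.g. tcp://manager.example.com:2376 (an assumed address)
# $2 = local directory holding ca.pem, cert.pem, and key.pem
run_remote_client() {
  docker run --rm -ti \
    -e DOCKER_HOST="$1" \
    -e DOCKER_TLS_VERIFY=1 \
    -e DOCKER_CERT_PATH=/certs \
    -v "$2:/certs:ro" \
    docker docker version
}
```

    Usage (with assumed values): run_remote_client tcp://manager.example.com:2376 "$HOME/.docker"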

Bind-mounting the Docker control socket

  • In Swarm mode, bind-mounting the control socket of a manager node gives you access to the whole cluster

  • You can tell Docker to place a given service on a manager node, using constraints:

      docker service create \
        --mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
        --name autoscaler --constraint node.role==manager ...
    

Constraints and global services

(New in Docker Engine 1.13)

  • By default, global services run on all nodes

    docker service create --mode global ...
    
  • You can specify constraints for global services

  • These services will run only on the nodes satisfying the constraints

  • For instance, this service will run on all manager nodes:

    docker service create --mode global --constraint node.role==manager ...
    

Constraints and dynamic scheduling

(New in Docker Engine 1.13)

  • If constraints change, services are started/stopped accordingly

    (e.g., with --constraint node.role==manager, when nodes are promoted/demoted)

  • This is particularly useful with labels:

    docker node update node1 --label-add defcon=five
    docker service create --constraint node.labels.defcon==five ...
    docker node update node2 --label-add defcon=five
    docker node update node1 --label-rm defcon=five
    

Shortcomings of dynamic scheduling

.warning[If a service becomes "unschedulable" (constraints can't be satisfied):]

  • It won't be scheduled automatically when constraints are satisfiable again

  • You will have to update the service; you can do a no-op update with:

    docker service update ... --force
    

.warning[Docker will silently ignore attempts to remove a non-existent label or constraint]

  • It won't warn you if you make a typo when removing a label or constraint!
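
    Since a mistyped removal goes unnoticed, one option is to read the labels or constraints back after each update. A minimal sketch; the helper names are hypothetical:

```shell
# Print the labels currently set on a node, as JSON.
# $1 = node name
show_node_labels() {
  docker node inspect "$1" --format '{{ json .Spec.Labels }}'
}

# Print the placement constraints currently set on a service, as JSON.
# $1 = service name
show_service_constraints() {
  docker service inspect "$1" \
    --format '{{ json .Spec.TaskTemplate.Placement.Constraints }}'
}
```

    For instance, after docker node update node1 --label-rm defcon, running show_node_labels node1 should no longer list defcon.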

Node management

  • SwarmKit lets you change (almost?) everything on the fly

  • Nothing should require a global restart

Node availability

    docker node update <node-name> --availability <active|pause|drain>

  • Active = schedule tasks on this node (default)

  • Pause = don't schedule new tasks on this node; existing tasks are not affected

    You can use it to troubleshoot a node without disrupting existing tasks

    It can also be used (in conjunction with labels) to reserve resources

  • Drain = don't schedule new tasks on this node; existing tasks are moved away

    This is just like crashing the node, but containers get a chance to shut down cleanly
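
    A maintenance window built on these availability modes might look like the sketch below. The helper names drain_node and activate_node are hypothetical.

```shell
# Drain a node, then list the tasks assigned to it so you can watch them
# being shut down and rescheduled elsewhere.
# $1 = node name
drain_node() {
  docker node update --availability drain "$1"
  docker node ps "$1"
}

# Make the node schedulable again once maintenance is over.
# Tasks that were moved away are not moved back automatically.
# $1 = node name
activate_node() {
  docker node update --availability active "$1"
}
```

    Usage: drain_node node2, do the maintenance, then activate_node node2.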

Managers and workers

  • Nodes can be promoted to manager with docker node promote

  • Nodes can be demoted to worker with docker node demote

  • This can also be done with docker node update <node> --role <manager|worker>

  • Reminder: this has to be done from a manager node
    (workers cannot promote themselves)

Removing nodes

  • You can leave Swarm mode with docker swarm leave

  • Nodes are drained before being removed (i.e. all tasks are rescheduled somewhere else)

  • Managers cannot leave without --force (the safe way is to demote them first)

  • After leaving, a node still shows up in docker node ls (in Down state)

  • When a node is Down, you can remove it with docker node rm (from a manager node)
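
    Cleaning up nodes that have left can be scripted from a manager. This is a sketch, assuming that docker node ls reports such nodes with the status Down, as described above; the helper name is hypothetical.

```shell
# Run on a manager: remove every node that is currently reported as Down.
remove_down_nodes() {
  docker node ls --format '{{.ID}} {{.Status}}' |
    while read -r id status; do
      if [ "$status" = "Down" ]; then
        docker node rm "$id"
      fi
    done
}
```

    Nodes that are Down only temporarily (e.g. rebooting) would be removed too, so check docker node ls before running it.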

Join tokens and automation

  • If you have used Docker 1.12-RC: join tokens are now mandatory!

  • You cannot specify your own token (SwarmKit generates it)

  • If you need to change the token: docker swarm join-token --rotate ...

  • To automate cluster deployment:

    • have a seed node do docker swarm init if it's not already in Swarm mode

    • propagate the token to the other nodes (secure bucket, facter, ohai...)
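
    The automation steps above can be sketched as three small helpers. The function names are hypothetical, and how the token travels from the seed node to the other nodes is left to your tooling.

```shell
# On the seed node: initialize Swarm mode only if the node is not already
# part of a cluster. (docker swarm init may need --advertise-addr on hosts
# with several network interfaces.)
init_seed_node() {
  swarm_state=$(docker info --format '{{ .Swarm.LocalNodeState }}')
  if [ "$swarm_state" = "inactive" ]; then
    docker swarm init
  fi
}

# On the seed node: print the worker join token and nothing else.
worker_token() {
  docker swarm join-token -q worker
}

# On each other node: join the cluster with the propagated token.
# $1 = token, $2 = manager address as host:port (2377 is the default port)
join_as_worker() {
  docker swarm join --token "$1" "$2"
}
```

    Because init_seed_node checks the current state first, it can be run repeatedly by a provisioning tool without failing on an existing cluster.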

Viewing disk usage: docker system df

(New in Docker Engine 1.13)

  • Shows disk usage for images, containers, and volumes

  • Breaks down between active and reclaimable categories

  • Check how much disk space is used at the end of the workshop:

    docker system df
    

Cleaning up disk: docker system prune

  • Removes stopped containers

  • Removes dangling images (that don't have a tag associated anymore)

  • Removes orphaned volumes (recent Docker Engine versions only do this with --volumes)

  • Removes empty networks

  • Try it:

    docker system prune -f
    

Note: docker system prune -a will also remove unused images.

Events

  • You can get a real-time stream of events with docker events

  • This will report local events and cluster events

  • Local events =
    all activity related to containers, images, plugins, volumes, networks, on this node

  • Cluster events =
    Swarm Mode activity related to services, nodes, secrets, configs, on the whole cluster

  • docker events doesn't report local events happening on other nodes

  • Events can be filtered (by type, target, labels...)

  • Events can be formatted with Go's text/template or in JSON
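
    Filtering and formatting can be combined, as in this sketch (helper names are hypothetical). Repeating the same filter key matches any of its values; different keys must all match.

```shell
# Stream container start and die events from the local engine,
# one JSON object per line.
watch_container_events() {
  docker events \
    --filter type=container \
    --filter event=start \
    --filter event=die \
    --format '{{ json . }}'
}

# Stream cluster-wide service events (run on a manager),
# one JSON object per line.
watch_service_events() {
  docker events \
    --filter type=service \
    --format '{{ json . }}'
}
```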

Getting all the events

  • There is no built-in way to get a stream of all the events on all the nodes

  • This can be achieved with (for instance) the following four services working together:

    • a Redis container (used as a stateless, fan-in message queue)

    • a global service bind-mounting the Docker socket, pushing local events to the queue

    • a similar singleton service to push global events to the queue

    • a queue consumer fetching events and processing them as you please

I'm not saying that you should implement it with shell scripts, but you totally could.

(It might or might not be one of the initiation rites of the House of Bash)
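
In that spirit, here is a rough sketch of the local-event pusher and of the consumer. It assumes a container image that ships both the docker CLI and redis-cli, the Docker socket bind-mounted into it, and a Redis server reachable under the hostname redis; the list name events and the function names are made up for illustration.

```shell
# Pusher (would run as a global service): forward each local event,
# as one JSON document, to a Redis list named "events".
push_local_events() {
  docker events --filter scope=local --format '{{ json . }}' |
    while read -r event; do
      redis-cli -h redis RPUSH events "$event" </dev/null >/dev/null
    done
}

# Consumer: block until an event is available, then print it.
# BLPOP prints the list name followed by the event itself.
consume_events() {
  while true; do
    redis-cli -h redis BLPOP events 0
  done
}
```

The singleton pusher for cluster events would be the same loop with --filter scope=swarm, running on a manager node.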

For more information about event filters and types, check the documentation.