Troubleshooting overlay networks

- Troubleshooting overlay networks
- Finding the real cause of the bottleneck
- Breaking into an overlay network
- Entering the debug container
- Labels
- Installing our debugging tools
- Investigating the `rng` service
- Investigating the VIP
- What if I don't like VIPs?
- Looking up VIP backends
- Testing and benchmarking our service
- Benchmarking `rng`
- Benchmark results for `rng`
- Benchmarking `hasher`
- Benchmarking `hasher`
- Benchmark results for `hasher`
- Why does everything take (at least) 100ms?
- Why did we sprinkle the code with sleeps?
- Why do `rng` and `hasher` behave differently?
- Global scheduling → global debugging
- More about overlay networks
Troubleshooting overlay networks

- We want to run tools like `ab` or `httping` on the internal network
- Ah, if only we had created our overlay network with the `--attachable` flag ... (see the sketch below)
- Oh well, let's use this as an excuse to introduce New Ways To Do Things
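For the record, here is what the `--attachable` approach would have looked like. This is a sketch, assuming we had created the network ourselves instead of letting the stack create it:

```bash
# Create an attachable overlay network (this is what we *didn't* do):
docker network create --driver overlay --attachable dockercoins_default

# Any standalone container could then join it directly:
docker run -ti --rm --network dockercoins_default alpine sh
```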
Breaking into an overlay network

- We will create a dummy placeholder service on our network
- Then we will use `docker exec` to run more processes in this container
- Start a "do nothing" container using our favorite Swiss-Army distro:

```bash
docker service create --network dockercoins_default --name debug \
       --constraint node.hostname==$HOSTNAME alpine sleep 1000000000
```

The constraint makes sure that the container will be created on the local node.
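To double-check that the constraint worked, we can ask Swarm where the task landed:

```bash
docker service ps debug
```

The NODE column should show the local node.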
Entering the debug container

- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node)
- Locate the container:

```bash
docker ps
```

- Enter it:

```bash
docker exec -ti containerID sh
```
Labels

- We can also be fancy and find the ID of the container automatically
- SwarmKit places labels on containers (see below for how to inspect them)
- Get the ID of the container:

```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug)
```

- And enter the container:

```bash
docker exec -ti $CID sh
```
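If you are curious about the exact labels, you can dump them from the container metadata. A quick check (the label set may vary across Docker versions):

```bash
docker inspect --format '{{ json .Config.Labels }}' $CID
```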
Installing our debugging tools

- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image
- But we can also dynamically install whatever we need
- Install a few tools:

```bash
apk add --update curl apache2-utils drill
```
Investigating the `rng` service

- First, let's check what `rng` resolves to
- Use drill or nslookup to resolve `rng`:

```bash
drill rng
```

This gives us one IP address. It is not the IP address of a container. It is a virtual IP address (VIP) for the `rng` service.
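We can cross-check this from a node (outside the debug container) by inspecting the service. The VIP we resolved should show up among the service's virtual IPs, given that the stack names the service `dockercoins_rng`:

```bash
docker service inspect --format '{{ json .Endpoint.VirtualIPs }}' dockercoins_rng
```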
Investigating the VIP

- Try to ping the VIP:

```bash
ping -c 3 rng
```

- It should ping. (But this might change in the future.)
- With Engine 1.12: VIPs respond to ping if a backend is available on the same machine.
- With Engine 1.13: VIPs respond to ping if a backend is available anywhere.
- (Again: this might change in the future.)
What if I don't like VIPs?

- Services can be published using two modes: VIP and DNSRR.
- With VIP, you get a virtual IP for the service, and a load balancer based on IPVS.
- (By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers, I highly recommend this talk by @kobolog at DC15EU!)
- With DNSRR, you get the former behavior (from Engine 1.11), where resolving the service yields the IP addresses of all the containers for this service.
- You change this with `docker service create --endpoint-mode [vip|dnsrr]`.
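For example, a hypothetical debug service using DNSRR mode (the service name here is a placeholder) could be created like this:

```bash
docker service create --name debug-dnsrr --endpoint-mode dnsrr \
       --network dockercoins_default alpine sleep 1000000000
```

Resolving `debug-dnsrr` would then directly yield one A record per container, instead of a single VIP.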
Looking up VIP backends

- You can also resolve a special name: `tasks.<name>`
- It will give you the IP addresses of the containers for a given service
- Obtain the IP addresses of the containers for the `rng` service:

```bash
drill tasks.rng
```

This should list 5 IP addresses.
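Since this gives us the individual backends, we can also bypass the VIP and query each container directly. A rough sketch, assuming drill's usual output format (one A record per line in the answer section):

```bash
# Extract the A records from drill's answer section, then hit each backend
for ip in $(drill tasks.rng | awk '!/^;;/ && $4 == "A" { print $5 }'); do
  echo -n "$ip => "
  curl -s http://$ip/8 | wc -c
done
```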
Testing and benchmarking our service

- We will send a test request to the `rng` service, then benchmark it with `ab`
- Make a test request to the service:

```bash
curl rng
```

- Open another window, and stop the workers, to test in isolation:

```bash
docker service update dockercoins_worker --replicas 0
```

Wait until the workers are stopped (check with `docker service ls`) before continuing.
Benchmarking `rng`

We will send 50 requests, but with various levels of concurrency.

- Send 50 requests, with a single sequential client:

```bash
ab -c 1 -n 50 http://rng/10
```

- Send 50 requests, with fifty parallel clients:

```bash
ab -c 50 -n 50 http://rng/10
```
Benchmark results for `rng`

- When serving requests sequentially, they each take 100ms.
- In the parallel scenario, the latency increased dramatically.
- What about `hasher`?
Benchmarking `hasher`

- We will do the same tests for `hasher`.
- The command is slightly more complex, since we need to post random data.
- First, we need to put the POST payload in a temporary file.
- Generate 10 bytes of random data:

```bash
curl http://rng/10 >/tmp/random
```
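We can quickly verify the payload size:

```bash
wc -c /tmp/random
```

It should report 10 (bytes).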
Benchmarking `hasher`

Once again, we will send 50 requests, with different levels of concurrency.

- Send 50 requests with a sequential client:

```bash
ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```

- Send 50 requests with 50 parallel clients:

```bash
ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```
Benchmark results for `hasher`

- The sequential benchmark takes ~5 seconds to complete.
- The parallel benchmark takes less than 1 second to complete.
- In both cases, each request takes a bit more than 100ms to complete.
- Requests are a bit slower in the parallel benchmark.
- It looks like `hasher` is better equipped to deal with concurrency than `rng`.
- Why?
Why does everything take (at least) 100ms?

Look at the `rng` and `hasher` code: each request handler deliberately sleeps for 100ms before responding.

But ... WHY?!?
Why did we sprinkle the code with sleeps?

- Deterministic performance (regardless of instance speed, CPUs, I/O...)
- Actual code sleeps all the time anyway
- When your code makes a remote API call:
  - it sends a request;
  - it sleeps until it gets the response;
  - it processes the response.
Why do `rng` and `hasher` behave differently?

(Synchronous vs. asynchronous event processing)
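The impact of synchronous processing on sleep-bound requests is easy to reproduce in a shell. A toy illustration (not the services' actual code):

```bash
# Sequential: 50 sleeps of 100ms add up to ~5 seconds
time sh -c 'for i in $(seq 50); do sleep 0.1; done'

# Concurrent: all 50 sleeps overlap, total time is roughly one sleep
time sh -c 'for i in $(seq 50); do sleep 0.1 & done; wait'
```

This matches what we measured: `rng` processes requests one at a time (sequential), while `hasher` handles them concurrently.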
Global scheduling → global debugging

Traditional approach:
- log into a node
- install our Swiss Army Knife (if necessary)
- troubleshoot things

Proposed alternative:
- put our Swiss Army Knife in a container (e.g. nicolaka/netshoot)
- run tests from multiple locations at the same time

(This becomes very practical with the `docker service logs` command, available since 17.05.)
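For instance, a hypothetical global debug service (the service name is a placeholder) would put nicolaka/netshoot on every node at once:

```bash
docker service create --name netshoot --mode global \
       --network dockercoins_default nicolaka/netshoot sleep 1000000000
```

We can then `docker exec` into the local replica on each node and run tests from all of them at the same time.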
More about overlay networks

- DC17US: Deep Dive in Docker Overlay Networks (video)
- DC17EU: Deeper Dive in Docker Overlay Networks (video)