Measuring cluster-wide network conditions
Since we have built-in, cluster-wide discovery, it's relatively straightforward to monitor the whole cluster automatically
Alexandros Mavrogiannis wrote Swarm NBT, a tool doing exactly that!
- Start Swarm NBT:
docker run --rm -v inventory:/inventory \
  -v /var/run/docker.sock:/var/run/docker.sock \
  alexmavr/swarm-nbt start
Note: in this mode, Swarm NBT connects to the Docker API socket, and issues additional API requests to start all the components it needs.
Viewing network conditions with Prometheus
Swarm NBT relies on Prometheus to scrape and store data
We can directly consume the Prometheus endpoint to view telemetry data
Point your browser to any Swarm node, on port 9090
(If you're using Play-With-Docker, click on the (9090) badge)
In the drop-down, select icmp_rtt_gauge_seconds
Click on "Graph"
You are now seeing ICMP latency across your cluster.
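The same data can also be fetched without a browser. The following is a minimal sketch, assuming the Prometheus instance on port 9090 exposes the standard HTTP API under /api/v1/query. The helper names prom_query_url and prom_query, and the example address 10.0.0.1, are illustrative and not part of Swarm NBT.

```shell
#!/bin/sh
# Hypothetical helpers: only the metric name and port 9090 come from the steps above.

# Build the URL of an instant query against Prometheus on a given node.
# Usage: prom_query_url NODE METRIC
prom_query_url() {
  printf 'http://%s:9090/api/v1/query?query=%s\n' "$1" "$2"
}

# Run the query with curl and print the raw JSON response.
# Usage: prom_query NODE METRIC
prom_query() {
  curl -fsS "$(prom_query_url "$1" "$2")"
}

# Example (replace 10.0.0.1 with the address of any Swarm node):
#   prom_query 10.0.0.1 icmp_rtt_gauge_seconds
```

The response is JSON, so it can be piped into any JSON tool for filtering or alerting.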
Viewing network conditions with Grafana
- If you are using a "real" cluster (not Play-With-Docker) you can use Grafana
- Start Grafana with
docker service create -p 3000:3000 grafana/grafana
- Point your browser to Grafana, on port 3000 on any Swarm node
- Log in with username admin and password admin
- Click on the top-left menu and browse to Data Sources
- Create a Prometheus data source with any name
- Point it to http://any-node-IP:9090
- Set access to "direct" and leave credentials blank
- Click on the top-left menu, highlight "Dashboards" and select the "Import" option
- Copy-paste this JSON payload, then use the Prometheus Data Source defined before
- Poke around the dashboard that magically appeared!
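The data source step can also be scripted instead of clicked through. This is a rough sketch, assuming the Grafana HTTP API is reachable on port 3000 with the default admin/admin login, and that the running Grafana version still accepts the "direct" access mode. The function names and the example values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: they mirror the manual data source steps above.

# Build the JSON body describing a Prometheus data source with "direct" access.
# NAME and PROMETHEUS_URL must not contain double quotes (no escaping is done).
# Usage: grafana_datasource_json NAME PROMETHEUS_URL
grafana_datasource_json() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"direct"}\n' "$1" "$2"
}

# POST the data source to Grafana on port 3000, using the admin/admin login.
# Usage: grafana_add_datasource GRAFANA_NODE NAME PROMETHEUS_URL
grafana_add_datasource() {
  curl -fsS -u admin:admin \
    -H 'Content-Type: application/json' \
    -X POST "http://$1:3000/api/datasources" \
    -d "$(grafana_datasource_json "$2" "$3")"
}

# Example (replace 10.0.0.1 with the address of any Swarm node):
#   grafana_add_datasource 10.0.0.1 swarm-nbt http://10.0.0.1:9090
```

Importing the dashboard itself is left to the web interface, as described in the steps above.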