Observability Metrics Tutorial

This tutorial shows you how to set up a complete monitoring stack for ZeroTier networks using Prometheus and Grafana. You can use this type of monitoring pipeline to measure the health and activity of your networks, whether you're using our managed environment or hosting your own controller.

Overview

ZeroTier Inc doesn't have access to your traffic. We don't currently supply a monitoring "dashboard" for your networks and nodes, but you can build your own powerful observability system!

This tutorial includes an example docker-compose configuration and setup scripts to run a ZeroTier network along with an observability stack based on Prometheus and Grafana.

Prerequisites

To run this locally, you'll need the following tools installed:

  • Docker + docker-compose
  • jq (JSON command-line processor)

We've tested this demo on macOS and Linux. On Windows, you'll need to use WSL2 for a local Linux environment. (Note: pull requests to add PowerShell support for the setup script are welcome!)

For this example, we're running a standalone controller, which allows us to avoid capturing and injecting a valid ZeroTier Central API token into the demo environment. However, the same observability tools and configuration methods will work for managed networks as well.

Setup

1. Get the Tutorial Files

Clone or download the tutorial repository:

git clone https://github.com/zerotier/metrics-tutorial.git
cd metrics-tutorial

2. Start the Stack

Run the docker-compose setup and initialization script:

docker-compose up -d
./setup.sh

This will start:

  • ZeroTier controller
  • Prometheus metrics collection
  • Grafana dashboard service
  • Example network configuration

Grafana Configuration

1. Access Grafana

Open http://localhost:3000/ in your browser.

2. Initial Login

  • Username: admin
  • Password: admin
  • Set a new password or click 'Skip'

3. Configure Prometheus Data Source

  1. Navigate to Connections > Data Sources > Add New Data Source
  2. Select Prometheus (first option)
  3. Set the Prometheus server URL: http://metrics-tutorial-prometheus-1:9090
  4. Click Save and test
  5. Click Explore view

4. Create Your First Query

  1. Select zt_packet in the Metric dropdown
  2. Choose Last 15 minutes in the range picker (upper-right)
  3. Click Run query

The graph will populate in the bottom half of the screen showing ZeroTier packet metrics.

5. Monitor Prometheus Jobs

You can also check the status of Prometheus jobs directly at: http://localhost:9090

Metrics Endpoint

URL: http://localhost:9993/metrics

Authentication: Requires the token from the metricstoken.secret file in the ZeroTier home directory. This file is separate from the authtoken.secret file used for the main service API.

Disk output: Metrics are also written to metrics.prom in the ZeroTier home directory, enabling collection via the Prometheus Node Exporter textfile collector.
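
As a rough illustration of reading the endpoint from a script, here is a minimal Python sketch. It assumes the token is sent as an HTTP bearer token, which is consistent with the bearer_token_file setting in the example Prometheus configuration later in this document. The helper names (read_token, build_metrics_request, fetch_metrics) and the default Linux home directory are illustrative choices, not part of ZeroTier.

```python
import urllib.request
from pathlib import Path

METRICS_URL = "http://localhost:9993/metrics"


def read_token(home_dir: str) -> str:
    """Read metricstoken.secret from the ZeroTier home directory."""
    return (Path(home_dir) / "metricstoken.secret").read_text().strip()


def build_metrics_request(token: str, url: str = METRICS_URL) -> urllib.request.Request:
    """Build a GET request that carries the token as a bearer token (assumed scheme)."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(home_dir: str = "/var/lib/zerotier-one") -> str:
    """Fetch the raw Prometheus text exposition from the local node."""
    request = build_metrics_request(read_token(home_dir))
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics() on a machine running ZeroTier should return the same text that Prometheus scrapes. Reading the token file typically requires elevated privileges.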

Available Metrics

ZeroTierOne exports ten metrics organized into three categories: general packet metrics, network-level metrics, and peer-level metrics. All metrics use the zt_ prefix.

General Packet Metrics

| Metric | Type | Labels | Description |
| ------ | ---- | ------ | ----------- |
| zt_packet | Counter | packet_type, direction | Total packet counts by type and direction |
| zt_packet_error | Counter | error_type, direction | Packet errors by type and direction |
| zt_data | Counter | protocol, direction | Bytes transmitted/received by protocol |
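
To make the label structure concrete, the following Python sketch sums the samples of one metric grouped by one label, working directly on Prometheus text exposition output. It is a simplified parser (it assumes label values contain no closing braces or escaped quotes), and the sample data is made up: the packet_type and error_type values are placeholders, not real ZeroTier names.

```python
import re
from collections import defaultdict

# Matches one labelled sample line, e.g. zt_packet{direction="rx"} 42
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(exposition: str, metric: str, label: str) -> dict[str, float]:
    """Sum the samples of one metric, grouped by the value of one label."""
    totals: dict[str, float] = defaultdict(float)
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match or match.group("name") != metric:
            continue  # skips comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if label in labels:
            totals[labels[label]] += float(match.group("value"))
    return dict(totals)


# Made-up sample; label values other than rx/tx are placeholders.
SAMPLE = """\
# TYPE zt_packet counter
zt_packet{packet_type="type_a",direction="rx"} 10
zt_packet{packet_type="type_b",direction="rx"} 5
zt_packet{packet_type="type_a",direction="tx"} 7
zt_packet_error{error_type="err_x",direction="rx"} 1
"""

print(sum_by_label(SAMPLE, "zt_packet", "direction"))  # {'rx': 15.0, 'tx': 7.0}
```

This is the same grouping that a PromQL sum by (direction) performs, applied to a single scrape rather than to a rate over time.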

Network-Level Metrics

| Metric | Type | Labels | Description |
| ------ | ---- | ------ | ----------- |
| zt_num_networks | Gauge | - | Number of networks currently joined |
| zt_network_multicast_groups_subscribed | Gauge | network_id | Multicast group subscriptions per network |
| zt_network_packets | Counter | network_id, direction | Packets sent/received per network |

Peer-Level Metrics

| Metric | Type | Labels | Description |
| ------ | ---- | ------ | ----------- |
| zt_peer_latency | Histogram | node_id | Peer-to-peer latency distribution (ms) |
| zt_peer_path_count | Gauge | node_id, status | Number of paths to each peer by status |
| zt_peer_packets | Counter | node_id, direction | Packets sent/received per peer |
| zt_peer_packet_errors | Counter | node_id | Packet errors per peer |

Label Values

| Label | Values | Description |
| ----- | ------ | ----------- |
| direction | tx, rx | Transmit or receive |
| node_id | 10-char hex | ZeroTier node identifier |
| network_id | 16-char hex | ZeroTier network identifier |
| status | varies | Path status (active/inactive) |
| packet_type | varies | ZeroTier protocol packet type |
| error_type | varies | Error classification |
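
If you post-process these labels in your own tooling, a small sanity check on the identifier formats can catch mislabeled series early. The sketch below is one possible approach in Python. The function names are hypothetical, and it accepts upper- or lowercase hex as an assumption, since the table above only specifies the length.

```python
import re

NODE_ID_RE = re.compile(r"[0-9a-f]{10}")      # 10 hex characters
NETWORK_ID_RE = re.compile(r"[0-9a-f]{16}")   # 16 hex characters


def is_node_id(value: str) -> bool:
    """True if value looks like a ZeroTier node ID (10 hex characters)."""
    return NODE_ID_RE.fullmatch(value.lower()) is not None


def is_network_id(value: str) -> bool:
    """True if value looks like a ZeroTier network ID (16 hex characters)."""
    return NETWORK_ID_RE.fullmatch(value.lower()) is not None


def is_direction(value: str) -> bool:
    """True if value is one of the two documented direction values."""
    return value in ("tx", "rx")
```

For example, is_node_id("0123456789") returns True, while is_node_id("0123456789abcdef") returns False because that string has the length of a network ID.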

Configuration

Enabling Metrics by Version

| Version | Metrics Enabled By Default | Configuration Required |
| ------- | -------------------------- | ---------------------- |
| 1.14.0 – 1.16.0 | Yes | None |
| 1.16.1+ | No | Local config toggle |

For version 1.16.1 and later, create or edit local.conf in the ZeroTier home directory:

Linux: /var/lib/zerotier-one/local.conf

macOS: /Library/Application Support/ZeroTier/One/local.conf

Windows: C:\ProgramData\ZeroTier\One\local.conf

{
  "settings": {
    "allowMetrics": true
  }
}

Restart the ZeroTier service after modifying this file.
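
If local.conf already contains other settings, you will want to merge the toggle in rather than overwrite the file. The following Python sketch shows one way to do that. The helper names are hypothetical, and it assumes the existing file, if present, is valid JSON.

```python
import json
from pathlib import Path


def enable_metrics(config: dict) -> dict:
    """Return a copy of a local.conf document with settings.allowMetrics set to true."""
    updated = dict(config)
    settings = dict(updated.get("settings", {}))
    settings["allowMetrics"] = True
    updated["settings"] = settings
    return updated


def update_local_conf(path: str) -> None:
    """Rewrite local.conf in place, creating it if it does not exist yet."""
    conf_path = Path(path)
    current = json.loads(conf_path.read_text()) if conf_path.exists() else {}
    conf_path.write_text(json.dumps(enable_metrics(current), indent=2) + "\n")
```

For example, update_local_conf("/var/lib/zerotier-one/local.conf") on Linux keeps any keys already present and adds only allowMetrics. The restart step above still applies.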

Compile-Time Options

| Flag | Default | Effect |
| ---- | ------- | ------ |
| ZT_ENABLE_METRICS | Enabled | Enables metrics compilation |
| ZT_NO_PEER_METRICS | Not set | Disables per-peer metrics to reduce cardinality |

The ZT_NO_PEER_METRICS flag is useful for deployments with high peer counts where per-peer metric cardinality would create excessive load on the metrics collection system.

Implementation Files

The metrics implementation resides in:

| File | Purpose |
| ---- | ------- |
| osdep/Metrics.hpp | Metric declarations and types |
| osdep/Metrics.cpp | Metric initialization and serialization |
| service/OneService.cpp | HTTP endpoint handler |

Example Prometheus Configuration

scrape_configs:
  - job_name: 'zerotier'
    static_configs:
      - targets: ['localhost:9993']
    bearer_token_file: '/var/lib/zerotier-one/metricstoken.secret'

Example Queries

Total packets by direction:

sum by (direction) (rate(zt_packet[5m]))

95th-percentile peer latency:

histogram_quantile(0.95, rate(zt_peer_latency_bucket[5m]))

Network traffic by network:

sum by (network_id, direction) (rate(zt_network_packets[5m]))
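
These queries can also be run outside Grafana through the Prometheus HTTP API. The Python sketch below builds an instant-query URL for the /api/v1/query endpoint and returns the result list. It assumes Prometheus is reachable at http://localhost:9090 as in this tutorial, and the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"


def instant_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build the URL for a Prometheus instant query, URL-encoding the PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(expr: str, base_url: str = PROMETHEUS_URL) -> list:
    """Run an instant query and return the list of result series."""
    with urllib.request.urlopen(instant_query_url(expr, base_url), timeout=10) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]
```

For example, run_query("sum by (direction) (rate(zt_packet[5m]))") should return one series per direction once Prometheus has collected a few minutes of data.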

Creating Custom Dashboards

Now that you have the basic pipeline running, you can create custom dashboards reporting ZeroTier metrics. Consider visualizing:

  • Network traffic patterns over time
  • Peer connectivity status
  • Error rates and network health
  • Geographic distribution of connections
  • Bandwidth utilization

Cleanup

When you're done with the tutorial, clean up the Docker containers:

./cleanup.sh

Alternative Monitoring Tools

Beyond Prometheus and Grafana, you can use your preferred monitoring tools over your ZeroTier networks:

Additional Resources

Community

If you've found a cool use for this set of observability hooks, we'd love to hear about it! Please reach out via: