# Metrics

⚠️ **DEPRECATION NOTICE** ⚠️

**This `metrics` component is unmaintained and being deprecated.**

The deprecated `metrics` component is being replaced by **`MetricsRegistry`**, built-in functionality that is now available directly in the `DistributedRuntime` framework.

**For new projects and existing deployments, please migrate to using `MetricsRegistry` instead of this component.**

This component may be migrated to the MetricsRegistry in the future.

**📖 See the [Dynamo MetricsRegistry Guide](../../docs/guides/metrics.md) for detailed information on using the new metrics system.**

---

The deprecated `metrics` component is a utility for collecting, aggregating, and publishing metrics from a Dynamo deployment, but it is unmaintained and being deprecated in favor of `MetricsRegistry`.

**Note**: This is a demo implementation. The deprecated `metrics` component is no longer under active development.
- In this demo the metric names use the prefix "llm", but in production they will be prefixed with "dynamo" (e.g., the HTTP `/metrics` endpoint will serve metrics with "dynamo" prefixes)
- This demo only works with `examples/llm/configs/agg.yml`; other configurations will not work

<div align="center">
  <img src="images/dynamo_metrics_grafana.png" alt="Dynamo Metrics Dashboard"/>
</div>

## Quickstart

To start the deprecated `metrics` component, point it at the `namespace/component/endpoint`
trio for the Dynamo workers whose metrics you want to monitor.

This will:
1. Collect statistics from workers associated with that `namespace/component/endpoint`
2. Postprocess and aggregate those statistics across the workers
3. Publish them on a Prometheus-compatible metrics endpoint

For example:
```bash
# Default namespace is "dynamo", but can be configured with --namespace
# For more detailed output, try setting the env var: DYN_LOG=debug
metrics --component MyComponent --endpoint my_endpoint

# 2025-03-17T00:07:05.202558Z  INFO metrics: Scraping endpoint dynamo/MyComponent/my_endpoint for stats
# 2025-03-17T00:07:05.202955Z  INFO metrics: Prometheus metrics server started at 0.0.0.0:9091/metrics
# ...
```

With no matching endpoints running to collect stats from, you should see warnings in the logs:
```bash
2025-03-17T00:07:06.204756Z  WARN metrics: No endpoints found matching dynamo/MyComponent/my_endpoint
```

Once a worker with a matching endpoint starts, the endpoint
is discovered automatically and the warnings stop.

## Workers

The deprecated `metrics` component needs running workers to gather metrics from,
so below are some examples of workers and how they can be monitored.

### Mock Worker

To try out the deprecated `metrics` component, there is a demo Rust-based
[mock worker](src/bin/mock_worker.rs) that provides sample data through two mechanisms:
1. It exposes a stats handler at `dynamo/MyComponent/my_endpoint` that responds to polling requests (from the deprecated `metrics` component) with randomly generated `ForwardPassMetrics` data
2. It publishes mock `KVHitRateEvent` data every second to demonstrate event-based metrics

Step 1: Launch a mock worker via the following command (if already built):
```bash
# or build/run from source: DYN_LOG=DEBUG cargo run --bin mock_worker
mock_worker

# 2025-03-16T23:49:28.101668Z  INFO mock_worker: Starting Mock Worker on Endpoint: dynamo/MyComponent/my_endpoint
```

Step 2: Monitor the metrics of the mock worker. The `metrics` component exposes its own
Prometheus endpoint on `/metrics` at port 9091 (the default when `--port` is not specified):
```bash
metrics --component MyComponent --endpoint my_endpoint
```
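
When scripting the two steps above, it can help to wait until the endpoint actually serves samples before moving on. The following is a minimal sketch, not part of the `metrics` component: `wait_for_samples` and `fake_scrape` are hypothetical helpers, and the sketch assumes that sample lines with the demo's `llm` prefix only appear once a worker has been discovered.

```bash
# Poll a command until its output contains a metric sample line,
# or give up after the given number of attempts (hypothetical helper).
wait_for_samples() {
  local attempts="$1"; shift
  local i
  for ((i = 0; i < attempts; i++)); do
    # Sample lines start with the metric name; HELP/TYPE lines start with '#'.
    if "$@" 2>/dev/null | grep -q '^llm_'; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Stand-in for `curl -s localhost:9091/metrics` so the sketch runs without a server.
fake_scrape() {
  echo 'llm_kv_blocks_active{component="MyComponent",endpoint="my_endpoint",worker_id="1"} 40'
}

wait_for_samples 3 fake_scrape && echo "samples available"
```

Against a running deployment, the call would look like `wait_for_samples 30 curl -s localhost:9091/metrics`.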

### Real Worker

To run a more realistic deployment to gather metrics from,
see the examples in [examples/llm](../../examples/llm).

```bash
python -m dynamo.frontend &
python -m dynamo.vllm --model-path <your-model-checkout>
```

Then, to monitor the metrics of these VllmWorkers, run:
```bash
metrics --component backend --endpoint load_metrics
```

**NOTE**: `load_metrics` is currently a
[hard-coded](https://github.com/ai-dynamo/dynamo/blob/d5220c7b1151372ba3d2a061c7d0a7ed72724789/lib/llm/src/kv_router/publisher.rs#L108)
endpoint name used for Python-based workers that register a `WorkerMetricsPublisher`.

## Visualization

To visualize the metrics being exposed on the Prometheus endpoint,
see the Prometheus and Grafana configurations in
[deploy/metrics](../../deploy/metrics):
```bash
docker compose -f deploy/docker-compose.yml --profile metrics up -d
```

## Metrics Collection Modes

The deprecated `metrics` component supports two modes for exposing metrics in Prometheus format:

### Pull Mode (Default)

When running in pull mode (the default), the deprecated `metrics` component exposes a
Prometheus metrics endpoint on the specified host and port that a
Prometheus server or curl client can pull from:

```bash
# Start metrics server on default host (0.0.0.0) and port (9091)
metrics --component MyComponent --endpoint my_endpoint

# Or specify a custom port
metrics --component MyComponent --endpoint my_endpoint --port 9092
```

In pull mode:
- The `--host` parameter must be a valid IPv4 or IPv6 address (e.g., "0.0.0.0", "127.0.0.1")
- The `--port` parameter specifies which port the HTTP server will listen on

You can then query the metrics using:
```bash
curl localhost:9091/metrics

# # HELP llm_kv_blocks_active Active KV cache blocks
# # TYPE llm_kv_blocks_active gauge
# llm_kv_blocks_active{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033398"} 40
# llm_kv_blocks_active{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033401"} 2
# # HELP llm_kv_blocks_total Total KV cache blocks
# # TYPE llm_kv_blocks_total gauge
# llm_kv_blocks_total{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033398"} 100
# llm_kv_blocks_total{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033401"} 100
```
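
The scraped text can be post-processed with standard tools. As a rough sketch, the snippet below derives a per-worker KV cache utilization from the two gauges shown above. `sample_metrics` and `kv_utilization` are hypothetical helpers, not part of the `metrics` component, and the sample text is copied from the output above so that the block runs without a server.

```bash
# Sample exposition text copied from the output above; against a live
# deployment, use `curl -s localhost:9091/metrics` instead.
sample_metrics() {
  cat <<'EOF'
# HELP llm_kv_blocks_active Active KV cache blocks
# TYPE llm_kv_blocks_active gauge
llm_kv_blocks_active{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033398"} 40
llm_kv_blocks_active{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033401"} 2
# HELP llm_kv_blocks_total Total KV cache blocks
# TYPE llm_kv_blocks_total gauge
llm_kv_blocks_total{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033398"} 100
llm_kv_blocks_total{component="MyComponent",endpoint="my_endpoint",worker_id="7587884888253033401"} 100
EOF
}

# Print "<worker_id> <active>/<total> <percent>%" for each worker, sorted by id.
kv_utilization() {
  awk '
    /^#/ { next }                                  # skip HELP/TYPE lines
    match($0, /worker_id="[0-9]+"/) {
      # Keep only the digits between the quotes of the worker_id label.
      id = substr($0, RSTART + 11, RLENGTH - 12)
      if ($0 ~ /^llm_kv_blocks_active/) { active[id] = $NF }
      if ($0 ~ /^llm_kv_blocks_total/)  { total[id]  = $NF }
    }
    END {
      for (id in total) {
        if (total[id] > 0) {
          printf "%s %d/%d %.1f%%\n", id, active[id], total[id], 100 * active[id] / total[id]
        }
      }
    }
  ' | sort
}

sample_metrics | kv_utilization
```

With a running component, `curl -s localhost:9091/metrics | kv_utilization` would apply the same calculation to live data. Worker IDs are handled as strings because they exceed the range that awk can represent exactly as numbers.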

### Push Mode

For ephemeral or batch jobs, or when metrics need to be pushed through a firewall,
you can use push mode. In this mode, the deprecated `metrics` component periodically pushes
metrics to an externally hosted
[Prometheus PushGateway](https://prometheus.io/docs/instrumenting/pushing/):

Start a Prometheus PushGateway service via Docker:
```bash
docker run --rm -d -p 9091:9091 --name pushgateway prom/pushgateway
```

Start the deprecated `metrics` component in `--push` mode, specifying the host and port of your PushGateway:
```bash
# Push metrics to a Prometheus PushGateway every --push-interval seconds
metrics \
    --component MyComponent \
    --endpoint my_endpoint \
    --host 127.0.0.1 \
    --port 9091 \
    --push
```

When using Push mode:
- The `--host` parameter must be a valid IPv4 or IPv6 address (e.g., "0.0.0.0", "127.0.0.1")
  that the Prometheus PushGateway is running on
- The `--port` parameter specifies the port of the Prometheus PushGateway
- The push interval can be configured with `--push-interval` (default: 2 seconds)
- A default job name of "dynamo_metrics" is used for the Prometheus job label
- Metrics persist in the PushGateway until explicitly deleted
- Prometheus should be configured to scrape the PushGateway with `honor_labels: true`

To view the metrics hosted on the PushGateway:
```bash
# View all metrics
# curl http://<pushgateway_ip>:<pushgateway_port>/metrics
curl 127.0.0.1:9091/metrics
```
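
Because pushed metrics persist until explicitly deleted, stale series can linger after the component stops. The sketch below uses the PushGateway's HTTP API, which deletes a metric group on a `DELETE` request to its grouping-key URL. The helper names and the `DRY_RUN` switch are hypothetical, and the sketch assumes the metrics were pushed under the job name `dynamo_metrics` with no additional grouping labels.

```bash
# Build the grouping-key URL for a job on a PushGateway (hypothetical helper).
pushgateway_job_url() {
  local host_port="$1" job="$2"
  printf 'http://%s/metrics/job/%s' "$host_port" "$job"
}

# Delete the metric group identified by the job name alone.
# With DRY_RUN=1 the curl command is printed instead of executed.
delete_pushgateway_job() {
  local url
  url="$(pushgateway_job_url "$1" "$2")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -X DELETE $url"
  else
    curl -fsS -X DELETE "$url"
  fi
}

# Preview the request for the PushGateway started above.
DRY_RUN=1 delete_pushgateway_job 127.0.0.1:9091 dynamo_metrics
```

Dropping `DRY_RUN=1` sends the request. If the metrics were pushed with extra grouping labels, those labels would also need to appear in the URL path.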
## Building/Running from Source

For easy iteration while making edits to the deprecated `metrics` component, you can use `cargo run`
to build and run with your local changes:

```bash
cargo run --bin metrics -- --component MyComponent --endpoint my_endpoint
```
