<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->

# Metrics Visualization with Prometheus and Grafana

## Overview

This guide shows how to set up Prometheus and Grafana for visualizing Dynamo metrics on a single machine for demo purposes.

![Grafana Dynamo Dashboard](./grafana-dynamo-composite.png)

**Components:**
- **Prometheus Server** - Collects and stores metrics from Dynamo services
- **Grafana** - Provides dashboards by querying the Prometheus Server

**For metrics reference**, see [Metrics Documentation](metrics.md).

## Environment Variables

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `DYN_SYSTEM_PORT` | System metrics/health port | `-1` (disabled) | `8081` |
25

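As a minimal sketch of how the table's default plays out, the helper below prints the worker metrics URL implied by `DYN_SYSTEM_PORT`, or `disabled` when the variable is unset or left at `-1`. The function name `worker_metrics_url` is hypothetical and not part of Dynamo; the `localhost` host and `/metrics` path assume the single-machine setup used in this guide.

```bash
# Hypothetical helper (not part of Dynamo): report where worker metrics
# would be served, based on the DYN_SYSTEM_PORT environment variable.
worker_metrics_url() {
  # Fall back to -1, the documented default that disables the port.
  port="${DYN_SYSTEM_PORT:--1}"
  if [ "$port" = "-1" ]; then
    echo "disabled"
  else
    # Assumes a local worker, as in this single-machine guide.
    echo "http://localhost:${port}/metrics"
  fi
}
```

With `DYN_SYSTEM_PORT=8081` exported, `worker_metrics_url` prints `http://localhost:8081/metrics`.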
## Getting Started Quickly

This is a single-machine example.

### Start the Observability Stack

Start the observability stack (Prometheus, Grafana, Tempo, exporters). See [Observability Getting Started](README.md#getting-started-quickly) for instructions and prerequisites.

### Start Dynamo Components

Start the frontend and a worker (a simple single-GPU example):

```bash
# Start frontend (default port 8000, override with --http-port or DYN_HTTP_PORT env var)
python -m dynamo.frontend &

# Start vLLM worker with metrics enabled on port 8081
DYN_SYSTEM_PORT=8081 python -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager
```

After the worker is running, send a few test requests to populate the metrics:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-0.6B",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_completion_tokens": 100
  }'
```

After you send a few requests, the metrics are available in Prometheus exposition format at:
- Frontend: `http://localhost:8000/metrics`
- Backend worker: `http://localhost:8081/metrics`

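To see which metrics an endpoint currently exposes, one option is to reduce the exposition-format text to a list of unique metric names. The sketch below assumes `grep`, `sed`, and `sort` are available; the function name `metric_names` is illustrative and not part of Dynamo.

```bash
# Read Prometheus exposition-format text on stdin and print the unique
# metric names. Lines starting with '#' are HELP/TYPE comments, and the
# name is everything before the first '{' (labels) or space (value).
metric_names() {
  grep -v -e '^#' -e '^$' | sed 's/[{ ].*//' | sort -u
}
```

For example, `curl -s http://localhost:8000/metrics | metric_names` lists the frontend metric names, and the same pipeline against port `8081` lists the worker's.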
### Access Web Interfaces

Once Dynamo components are running:

1. Open **Grafana** at `http://localhost:3000` (username: `dynamo`, password: `dynamo`)
2. Click on **Dashboards** in the left sidebar
3. Select **Dynamo Dashboard** to view metrics and traces

Other interfaces:
- **Prometheus**: `http://localhost:9090`
- **Tempo** (tracing): Accessible through Grafana's Explore view. See [Tracing Guide](tracing.md) for details.

**Note:** If accessing from another machine, replace `localhost` with the machine's hostname or IP address, and ensure firewall rules allow access to these ports (3000, 9090).

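A quick way to confirm that both web interfaces respond is to probe their health endpoints. The sketch below assumes `curl` is installed and uses Grafana's `/api/health` and Prometheus's `/-/healthy` endpoints on the ports listed above; the `check_url` helper is a hypothetical name introduced here.

```bash
# Hypothetical helper: print "<name>: UP" if the URL answers with a
# successful HTTP status within 3 seconds, otherwise "<name>: DOWN".
check_url() {
  if curl -fsS --max-time 3 -o /dev/null "$2" 2>/dev/null; then
    echo "$1: UP"
  else
    echo "$1: DOWN"
  fi
}

# Assumes the observability stack is running on this machine.
check_url Grafana "http://localhost:3000/api/health"
check_url Prometheus "http://localhost:9090/-/healthy"
```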
---

## Configuration

### Prometheus

The Prometheus configuration is specified in [prometheus.yml](../../deploy/observability/prometheus.yml). This file is set up to collect metrics from the metrics aggregation service endpoint.

You may need to modify the target settings to match your host configuration and network environment.

After making changes to `prometheus.yml`, restart the Prometheus service. See [Observability Getting Started](README.md#getting-started-quickly) for Docker Compose commands.

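Before editing, it can help to list the scrape targets currently configured. The sketch below simply greps for `targets:` lines, assuming the file uses the usual `static_configs` layout; `list_scrape_targets` is a hypothetical helper, and the path in the usage note assumes you run it from the repository root.

```bash
# Hypothetical helper: print every "targets:" line of a Prometheus config
# file, prefixed with its line number, so you can see which host:port
# pairs would be scraped.
list_scrape_targets() {
  grep -nE '^[[:space:]]*(-[[:space:]]*)?targets:' "$1"
}
```

For example: `list_scrape_targets deploy/observability/prometheus.yml`.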
### Grafana

Grafana is pre-configured with:
- Prometheus datasource
- Sample dashboard for visualizing service metrics

### Troubleshooting

1. Verify services are running using `docker compose ps`.

2. Check logs using `docker compose logs`.

3. Check Prometheus targets at `http://localhost:9090/targets` to verify metric collection.

4. If you encounter issues with stale data or configuration, stop services and wipe volumes using `docker compose down -v`, then restart.

   **Note:** The `-v` flag removes named volumes (grafana-data, tempo-data), which will reset dashboards and stored metrics.

For specific Docker Compose commands, see [Observability Getting Started](README.md#getting-started-quickly).

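The target check in step 3 can also be done from the command line through the Prometheus HTTP API (`/api/v1/targets`). The sketch below tallies targets by their `health` field using only `grep`, `sed`, `sort`, `uniq`, and `awk`; it is a rough text match rather than a real JSON parser, and `summarize_target_health` is a hypothetical helper name.

```bash
# Hypothetical helper: read the JSON body of /api/v1/targets on stdin and
# print one "<health>=<count>" line per health state (up, down, unknown).
summarize_target_health() {
  grep -oE '"health":[[:space:]]*"[a-z]+"' \
    | sed -E 's/.*"([a-z]+)"$/\1/' \
    | sort | uniq -c | awk '{print $2 "=" $1}'
}
```

For example, `curl -s http://localhost:9090/api/v1/targets | summarize_target_health` prints a line such as `up=2` when two targets are being scraped successfully.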
## Developer Guide

For detailed information on creating custom metrics in Dynamo components, see:

- [Metrics Developer Guide](metrics-developer-guide.md)