# Metrics Visualization with Prometheus and Grafana

This directory contains configuration for visualizing metrics from the metrics aggregation service using Prometheus and Grafana.

## Components

- **Prometheus**: Collects and stores metrics from the service
- **Grafana**: Provides visualization dashboards for the metrics

## Getting Started

1. Make sure Docker and Docker Compose are installed on your system

2. Start the `components/metrics` application to begin monitoring for metric events from Dynamo workers
   and aggregating them on a Prometheus metrics endpoint: `http://localhost:9091/metrics`.

3. Start one or more workers that publish KV Cache metrics.
   - For quick testing, `examples/rust/service_metrics/bin/server.rs` can populate dummy KV Cache metrics.
   - For a real workflow with real data, see the KV Routing example in `examples/python_rs/llm/vllm`.

4. Start the visualization stack:

   ```bash
   docker compose --profile metrics up -d
   ```

5. The following web servers should now be running:
   - Grafana: `http://localhost:3001` (default login: admin/admin), started by Docker Compose
   - Prometheus server: `http://localhost:9090`, started by Docker Compose
   - Prometheus metrics endpoint: `http://localhost:9091/metrics`, started by the `components/metrics` application
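
Once the stack is up, a quick reachability check can confirm that all three endpoints respond. The following is a minimal sketch, assuming `curl` is installed. The `check_endpoint` and `check_stack` function names are hypothetical helpers, and the health paths `/api/health` (Grafana) and `/-/healthy` (Prometheus) are those servers' standard health endpoints rather than anything defined in this directory.

```bash
# Probe one URL and report whether it answered with a successful HTTP status.
check_endpoint() {
  local name=$1 url=$2
  # -f: fail on HTTP errors, -sS: quiet but still print errors,
  # --max-time: give up after 5 seconds
  if curl -fsS -o /dev/null --max-time 5 "$url"; then
    echo "OK   $name ($url)"
  else
    echo "FAIL $name ($url)"
    return 1
  fi
}

# Probe the three endpoints listed above; returns non-zero if any check fails.
check_stack() {
  local rc=0
  check_endpoint "Grafana" "http://localhost:3001/api/health" || rc=1
  check_endpoint "Prometheus server" "http://localhost:9090/-/healthy" || rc=1
  check_endpoint "Metrics endpoint" "http://localhost:9091/metrics" || rc=1
  return "$rc"
}
```

After sourcing these functions, running `check_stack` prints one `OK` or `FAIL` line per endpoint and returns non-zero if any endpoint is unreachable.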

## Configuration

### Prometheus

The Prometheus configuration is defined in `prometheus.yml`. It is configured to scrape metrics from the metrics aggregation service endpoint.

Note: You may need to adjust the target based on your host configuration and network setup.
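
For orientation, a scrape configuration for this setup might look like the sketch below. This is not the `prometheus.yml` shipped in this directory: the `print_scrape_config` helper, the job name, and the scrape interval are illustrative assumptions, and only port `9091` comes from the endpoint described above. Prometheus scrapes the `/metrics` path by default, so no `metrics_path` is set.

```bash
# Print an illustrative Prometheus scrape configuration to stdout.
# The first argument is the host:port to scrape (defaults to localhost:9091).
print_scrape_config() {
  local target=${1:-localhost:9091}
  cat <<EOF
global:
  scrape_interval: 5s   # assumed value, not taken from this repository

scrape_configs:
  - job_name: metrics-aggregation   # hypothetical job name
    static_configs:
      - targets: ["$target"]
EOF
}
```

When Prometheus runs in a container, `localhost` usually refers to the container itself, so a target such as `host.docker.internal:9091` (available on Docker Desktop) or the host's IP address may be needed, for example `print_scrape_config host.docker.internal:9091`.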

### Grafana

Grafana is pre-configured with:
- Prometheus datasource
- Sample dashboard for visualizing service metrics
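
One way to confirm that the datasource was provisioned is to ask Grafana's HTTP API for its datasource list. The sketch below assumes the default admin/admin login and uses a hypothetical helper name, `has_prometheus_datasource`. It matches on the JSON text rather than parsing it, so treat it as a rough check only.

```bash
# Reads the JSON body of Grafana's GET /api/datasources from stdin and
# succeeds if at least one datasource of type "prometheus" is present.
has_prometheus_datasource() {
  grep -Eq '"type"[[:space:]]*:[[:space:]]*"prometheus"'
}

# Example (requires the stack to be running):
#   curl -fsS -u admin:admin http://localhost:3001/api/datasources \
#     | has_prometheus_datasource && echo "Prometheus datasource found"
```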

## Required Files

The following configuration files should be present in this directory, except for `docker-compose.yml`, which lives in the parent directory:
- `../docker-compose.yml`: Defines the Prometheus and Grafana services
- `prometheus.yml`: Contains Prometheus scraping configuration
- `grafana.json`: Contains Grafana dashboard configuration
- `grafana-datasources.yml`: Contains Grafana datasource configuration
- `grafana-dashboard-providers.yml`: Contains Grafana dashboard provider configuration
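
Before starting the stack, it can help to check that these files are in place. The following is a small sketch with a hypothetical helper, `missing_config_files`, which takes the path of this directory (defaulting to the current directory) and expects the Compose file one level up.

```bash
# Print the name of every required file that is missing.
# The first argument is the metrics directory (defaults to ".").
missing_config_files() {
  local dir=${1:-.}
  local f
  for f in \
    "../docker-compose.yml" \
    "prometheus.yml" \
    "grafana.json" \
    "grafana-datasources.yml" \
    "grafana-dashboard-providers.yml"
  do
    # Paths are resolved relative to the metrics directory
    [ -f "$dir/$f" ] || echo "$f"
  done
}
```

Running `missing_config_files` from this directory prints nothing when every file is present.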

## Metrics

The Prometheus metrics endpoint exposes the following metrics:
- `llm_requests_active_slots`: Number of currently active request slots per worker
- `llm_requests_total_slots`: Total available request slots per worker
- `llm_kv_blocks_active`: Number of active KV blocks per worker
- `llm_kv_blocks_total`: Total KV blocks available per worker
- `llm_kv_hit_rate_percent`: Cumulative KV Cache hit rate per worker, as a percentage
- `llm_load_avg`: Average load across workers
- `llm_load_std`: Load standard deviation across workers
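
The active and total gauges above can be combined into utilization figures without Grafana. The sketch below reads Prometheus text exposition output from stdin and sums each metric across all workers. The `summarize_utilization` name is hypothetical, and the parsing assumes that label values contain no spaces and that samples carry no trailing timestamp.

```bash
# Reads Prometheus text exposition from stdin and prints aggregate
# utilization for request slots and KV blocks across all workers.
summarize_utilization() {
  awk '
    /^#/ { next }                # skip HELP/TYPE comment lines
    NF >= 2 {
      name = $1
      sub(/\{.*/, "", name)      # drop the label set, keep the metric name
      sum[name] += $NF           # the sample value is the last field
    }
    END {
      if (sum["llm_requests_total_slots"] > 0)
        printf "request slots: %.1f%%\n", 100 * sum["llm_requests_active_slots"] / sum["llm_requests_total_slots"]
      if (sum["llm_kv_blocks_total"] > 0)
        printf "kv blocks: %.1f%%\n", 100 * sum["llm_kv_blocks_active"] / sum["llm_kv_blocks_total"]
    }
  '
}

# Example (requires the components/metrics application to be running):
#   curl -fsS http://localhost:9091/metrics | summarize_utilization
```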

## Troubleshooting

1. Verify services are running:
   ```bash
   docker compose ps
   ```

2. Check logs:
   ```bash
   docker compose logs prometheus
   docker compose logs grafana
   ```
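
If both containers are running but the dashboards stay empty, the scrape target may be unreachable from Prometheus. Prometheus reports per-target health through its HTTP API at `/api/v1/targets`. The sketch below counts targets by health state. The `summarize_target_health` name is hypothetical, and the function matches on the JSON text rather than parsing it, so a tool such as `jq` is a better choice where available.

```bash
# Reads the JSON body of Prometheus's GET /api/v1/targets from stdin and
# prints one "<health> <count>" line per health state (up, down, unknown).
summarize_target_health() {
  grep -Eo '"health"[[:space:]]*:[[:space:]]*"[a-z]+"' \
    | awk -F'"' '{ count[$4]++ } END { for (h in count) print h, count[h] }' \
    | sort
}

# Example (requires the stack to be running):
#   curl -fsS http://localhost:9090/api/v1/targets | summarize_target_health
```

A `down` entry usually means the target address in `prometheus.yml` is not reachable from inside the Prometheus container.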