# Metrics Visualization with Prometheus and Grafana

This directory contains configuration for visualizing metrics from the metrics aggregation service using Prometheus and Grafana.

## Components

- **Prometheus**: Collects and stores metrics from the service
- **Grafana**: Provides visualization dashboards for the metrics

## Getting Started

1. Make sure Docker and Docker Compose are installed on your system

2. Start the `components/metrics` application to begin monitoring for metric events from dynamo workers
   and aggregating them on a prometheus metrics endpoint: `http://localhost:9091/metrics`.

3. Start worker(s) that publishes KV Cache metrics.
  - For quick testing, `examples/rust/service_metrics/bin/server.rs` can populate dummy KV Cache metrics.
  - For a real workflow with real data, see the KV Routing example in `examples/python_rs/llm/vllm`.

4. Start the visualization stack:

  ```bash
  docker compose --profile metrics up -d
  ```

5. Web servers started:
   - Grafana: `http://localhost:3001` (default login: admin/admin) (started by docker compose)
   - Prometheus Server: `http://localhost:9090` (started by docker compose)
   - Prometheus Metrics Endpoint: `http://localhost:9091/metrics` (started by `components/metrics` application)

## Configuration

### Prometheus

The Prometheus configuration is defined in `prometheus.yml`. It is configured to scrape metrics from the metrics aggregation service endpoint.

Note: You may need to adjust the target based on your host configuration and network setup.

### Grafana

Grafana is pre-configured with:
- Prometheus datasource
- Sample dashboard for visualizing service metrics

## Required Files

The following configuration files should be present in this directory:
- `docker-compose.yml`: Defines the Prometheus and Grafana services
- `prometheus.yml`: Contains Prometheus scraping configuration
- `grafana.json`: Contains Grafana dashboard configuration
- `grafana-datasources.yml`: Contains Grafana datasource configuration
- `grafana-dashboard-providers.yml`: Contains Grafana dashboard provider configuration

## Metrics

The prometheus metrics endpoint exposes the following metrics:
- `llm_requests_active_slots`: Number of currently active request slots per worker
- `llm_requests_total_slots`: Total available request slots per worker
- `llm_kv_blocks_active`: Number of active KV blocks per worker
- `llm_kv_blocks_total`: Total KV blocks available per worker
- `llm_kv_hit_rate_percent`: Cumulative KV Cache hit percent per worker
- `llm_load_avg`: Average load across workers
- `llm_load_std`: Load standard deviation across workers

## Troubleshooting

1. Verify services are running:
  ```bash
  docker compose ps
  ```

2. Check logs:
  ```bash
  docker compose logs prometheus
  docker compose logs grafana
  ```