@@ -29,95 +29,46 @@ This guide covers single GPU demo setup using Docker Compose. For Kubernetes dep
...
@@ -29,95 +29,46 @@ This guide covers single GPU demo setup using Docker Compose. For Kubernetes dep
Start the observability stack (Prometheus, Grafana, Tempo, exporters). See [Observability Getting Started](README.md#getting-started-quickly) for instructions.
Start the observability stack (Prometheus, Grafana, Tempo, exporters). See [Observability Getting Started](README.md#getting-started-quickly) for instructions.
### 2. Set Environment Variables
### 2. Start Dynamo Components (Single GPU)
Configure Dynamo components to export traces:
For a simple single-GPU deployment, run the aggregated tracing launch script. This script enables tracing, sets per-component service names, and starts a frontend with a single vLLM worker:
Run the vLLM disaggregated script with tracing enabled:
For a disaggregated deployment with tracing, run the disaggregated tracing launch script. This script sets up tracing and launches a frontend, a decode worker on GPU 0, and a prefill worker on GPU 1:
```bash
```bash
# Navigate to vLLM launch directory
cd examples/backends/vllm/launch
cd examples/backends/vllm/launch
./disagg_tracing.sh
# Export tracing env vars, then run the disaggregated deployment script.
./disagg.sh
```
```
**Note:** The example vLLM `disagg.sh` sets per-worker `--kv-events-config` with unique ZMQ endpoints and unique
This separates prefill and decode onto different GPUs for better resource utilization.
`VLLM_NIXL_SIDE_CHANNEL_PORT` values to avoid "Address already in use" conflicts when multiple workers run on the same host. If you run the components manually, make sure you mirror those settings.
For disaggregated deployments, this separates prefill and decode onto different GPUs for better resource utilization.
### 4. Generate Traces
Send requests to the frontend to generate traces (works for both aggregated and disaggregated deployments). The launch scripts print an example `curl` command on startup with the correct model name.
Send requests to the frontend to generate traces (works for both aggregated and disaggregated deployments). **Note the `x-request-id` header**, which allows you to easily search for and correlate this specific trace in Grafana:
**Tip:** Add an `x-request-id` header to easily search for a specific trace in Grafana:
```bash
```bash
curl -H 'Content-Type: application/json' \
curl -H 'Content-Type: application/json' \
-H 'x-request-id: test-trace-001' \
-H 'x-request-id: test-trace-001' \
-d '{
-d '{
"model": "Qwen/Qwen3-0.6B",
"model": "<MODEL>",
"max_completion_tokens": 100,
"max_completion_tokens": 100,
"messages": [
"messages": [
{"role": "user", "content": "What is the capital of France?"}
{"role": "user", "content": "What is the capital of France?"}
2. Log in with username `dynamo` and password `dynamo`
2. Log in with username `dynamo` and password `dynamo`
...
@@ -145,7 +96,7 @@ Below is an example of what a trace looks like in Grafana Tempo:
...
@@ -145,7 +96,7 @@ Below is an example of what a trace looks like in Grafana Tempo:


### 6. Stop Services
### 5. Stop Services
When done, stop the observability stack. See [Observability Getting Started](README.md#getting-started-quickly) for Docker Compose commands.
When done, stop the observability stack. See [Observability Getting Started](README.md#getting-started-quickly) for Docker Compose commands.
...
@@ -157,56 +108,17 @@ For Kubernetes deployments, ensure you have a Tempo instance deployed and access
...
@@ -157,56 +108,17 @@ For Kubernetes deployments, ensure you have a Tempo instance deployed and access
### Modify DynamoGraphDeployment for Tracing
### Modify DynamoGraphDeployment for Tracing
Add common tracing environment variables at the top level and service-specific names in each component in your `DynamoGraphDeployment` (e.g., `examples/backends/vllm/deploy/disagg.yaml`):
Tracing-enabled variants of the example deployments are provided:
These variants add the variables listed under [Environment Variables](#environment-variables) to the base `agg.yaml` / `disagg.yaml` deployments. To override the Tempo endpoint, edit the value of `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` in the YAML.
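
If you would rather script the endpoint override than edit the file by hand, the following is a minimal sketch. It assumes, without confirmation from this guide, that the variable appears as a Kubernetes-style `- name:` / `value:` pair on two consecutive lines. The helper name `set_tempo_endpoint`, the file path, and the URL in the usage note are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: rewrite the Tempo endpoint in a tracing-enabled deployment YAML.
# Assumed layout (not confirmed by this guide):
#   - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
#     value: <url>
set_tempo_endpoint() {
  local file="$1" endpoint="$2"
  # Find the line naming the variable, step to the next line (n), and
  # replace everything from "value:" onward with the new endpoint.
  # Output goes to a temp file first so the original survives a sed failure.
  # Note: the endpoint must not contain '|' or '&', which sed treats specially.
  sed "/name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT/{
n
s|value:.*|value: \"${endpoint}\"|
}" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}
```

For example, `set_tempo_endpoint path/to/your-tracing-deployment.yaml "http://tempo.example:4317"` would update the file in place. Review the result with `git diff` before applying it, since a YAML file laid out differently from the assumed form will be left unchanged or only partially edited.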