<!-- # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->

# Benchmarks

This directory contains benchmarking scripts and tools for evaluating the performance of Dynamo deployments. The benchmarking framework wraps genai-perf, making it easy to benchmark DynamoGraphDeployments or any other deployment with an exposed endpoint.

## Quick Start

### Benchmark a Dynamo Deployment
First, deploy your DynamoGraphDeployment using the [deployment documentation](../components/backends/), then:

```bash
# Port-forward your deployment to http://localhost:8000
kubectl port-forward -n <namespace> svc/<frontend-service-name> 8000:8000 > /dev/null 2>&1 &

# Run benchmark
python3 -m benchmarks.utils.benchmark \
    --input my-benchmark=http://localhost:8000 \
    --model "<your-model>"

# Generate plots
python3 -m benchmarks.utils.plot --data-dir ./benchmarks/results

# Or plot only specific benchmark experiments
python3 -m benchmarks.utils.plot --data-dir ./benchmarks/results --benchmark-name my-benchmark
```
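
Because the port-forward command runs in the background, the benchmark can start before the tunnel accepts connections. The following is a minimal Python sketch of a readiness check, assuming the forward targets `localhost:8000` as in the example above. `wait_for_port` is a hypothetical helper written for illustration; it is not part of the benchmarking framework.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (not executed here): block until the forwarded port is reachable.
# if not wait_for_port("localhost", 8000):
#     raise SystemExit("port-forward is not reachable on localhost:8000")
```

A plain TCP check only confirms that something is listening on the port; it does not verify that the model behind the endpoint is loaded and serving requests.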

## Features

Benchmark any HTTP endpoint. The benchmarking framework supports:

**Flexible Configuration:**
- User-defined labels for each input using `--input label=value` format
- Support for multiple inputs to enable comparisons
- Customizable concurrency levels (set via the `CONCURRENCIES` environment variable), sequence lengths, and models
- Automated performance plot generation with custom labels
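
To illustrate the `label=value` convention, here is a hedged Python sketch of how such arguments could be split into labeled endpoints. `parse_input` and `parse_inputs` are hypothetical helpers, and the `baseline` label with its URL is invented for the example; the framework's actual parsing may differ.

```python
def parse_input(spec: str) -> tuple[str, str]:
    """Split one `label=value` string into (label, value)."""
    # Split on the first '=' only, so values that contain '=' stay intact.
    label, sep, value = spec.partition("=")
    if not sep or not label or not value:
        raise ValueError(f"expected label=value, got {spec!r}")
    return label, value


def parse_inputs(specs: list[str]) -> dict[str, str]:
    """Map each label to its value, rejecting duplicate labels."""
    inputs: dict[str, str] = {}
    for spec in specs:
        label, value = parse_input(spec)
        if label in inputs:
            raise ValueError(f"duplicate label: {label!r}")
        inputs[label] = value
    return inputs


# Two labeled endpoints, as one might pass for a side-by-side comparison.
endpoints = parse_inputs([
    "my-benchmark=http://localhost:8000",
    "baseline=http://localhost:9000",  # hypothetical second endpoint
])
```

Unique labels matter in this sketch because a label is what identifies a set of results, for example when selecting one with `--benchmark-name`.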

**Sequential Execution:**
- Benchmarks are run sequentially, not in parallel
- To avoid interference, ensure only one deployment is utilizing the target GPUs during a run
- This helps produce more comparable measurements across configurations
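
If you drive several benchmark invocations from your own script, the same one-at-a-time discipline can be sketched in Python as follows. `run_sequentially` is a hypothetical helper, not part of the framework; it relies only on the fact that `subprocess.run` blocks until the child process exits.

```python
import subprocess
import sys


def run_sequentially(commands: list[list[str]]) -> list[int]:
    """Run each command to completion before starting the next; return exit codes."""
    exit_codes = []
    for cmd in commands:
        # subprocess.run blocks until the process exits, so runs never overlap.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
    return exit_codes


# Example (not executed here), reusing the Quick Start invocation:
# run_sequentially([
#     [sys.executable, "-m", "benchmarks.utils.benchmark",
#      "--input", "my-benchmark=http://localhost:8000",
#      "--model", "<your-model>"],
# ])
```

The sketch records exit codes instead of stopping at the first failure, so one failed run does not silently skip the remaining configurations.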

**Supported Backends:**
- DynamoGraphDeployments with port-forwarded endpoints
- External HTTP endpoints (for comparison with non-Dynamo backends or platforms)

## Installation

The benchmarking framework is already included in the Dynamo container images. To install it locally or standalone:

```bash
pip install -e .
```

## Data Generation Tools

This directory also includes lightweight tools for:
- Analyzing prefix-structured data (`datagen analyze`)
- Synthesizing customizable structured data for testing (`datagen synthesize`)

Detailed information is provided in the `prefix_data_generator` directory.

## Comprehensive Guide

For detailed documentation, configuration options, and advanced usage, see the [complete benchmarking guide](../docs/benchmarks/benchmarking.md).