"docs/vscode:/vscode.git/clone" did not exist on "136825de756f5421283e404e3991b77a9d33c131"
Unverified commit c82f477d, authored by hhzhang16 and committed by GitHub

feat: decouple dynamo k8s setup from additional benchmarking requirements (#2973)


Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
parent b96a5a6f
......@@ -98,8 +98,8 @@ NOTE:
- To benchmark non-Dynamo backends (vLLM, TensorRT-LLM, SGLang, etc.), deploy them
manually following their Kubernetes deployment guides, expose a port (e.g., via port-forward),
and use the endpoint option.
- For Dynamo deployment setup, setup_k8s_namespace.sh provides fully encapsulated
deployment setup including namespace creation, CRDs, and operator installation.
- For Dynamo deployment setup, follow the main installation guide at docs/guides/dynamo_deploy/installation_guide.md
to install the platform, then use setup_benchmarking_resources.sh for benchmarking resources.
- The --model flag configures GenAI-Perf and should match what's configured in your deployment manifests and endpoints.
- Only one model can be benchmarked at a time across all inputs.
......
# Kubernetes utilities for Dynamo
# Kubernetes utilities for Dynamo Benchmarking and Profiling
This directory contains small utilities and manifests used by benchmarking and profiling flows.
This directory contains utilities and manifests for Dynamo benchmarking and profiling workflows.
## Prerequisites
**Before using these utilities, you must first set up Dynamo Cloud following the main installation guide:**
👉 **[Follow the Dynamo Cloud installation guide](../../guides/dynamo_deploy/installation_guide.md) to install the Dynamo Kubernetes Platform first.**
This includes:
1. Installing the Dynamo CRDs
2. Installing the Dynamo Platform (operator, etcd, NATS)
3. Setting up your target namespace
## Contents
- `setup_k8s_namespace.sh` — **fully encapsulated deployment setup** that provides one-time per Kubernetes namespace setup. Creates namespace (if missing), applies common manifests, installs CRDs, and deploys the Dynamo operator. If `DOCKER_SERVER`/`IMAGE_TAG` are provided, it installs your custom operator image; otherwise it installs the default published image. If your registry is private, provide `DOCKER_USERNAME`/`DOCKER_PASSWORD` or respond to the prompt to create an image pull secret.
- `setup_benchmarking_resources.sh` — Sets up benchmarking and profiling resources in your existing Dynamo namespace
- `manifests/`
- `serviceaccount.yaml` — ServiceAccount `dynamo-sa`
- `role.yaml` — Role `dynamo-role`
- `serviceaccount.yaml` — ServiceAccount `dynamo-sa` for benchmarking and profiling jobs
- `role.yaml` — Role `dynamo-role` with necessary permissions
- `rolebinding.yaml` — RoleBinding `dynamo-binding`
- `pvc.yaml` — PVC `dynamo-pvc`
- `pvc.yaml` — PVC `dynamo-pvc` for storing profiler results and configurations
- `pvc-access-pod.yaml` — short‑lived pod for copying profiler results from the PVC
- `kubernetes.py` — helper used by tooling to apply/read resources (e.g., access pod for PVC downloads).
- `kubernetes.py` — helper used by tooling to apply/read resources (e.g., access pod for PVC downloads)
- `inject_manifest.py` — utility for injecting deployment configurations into the PVC for profiling
- `download_pvc_results.py` — utility for downloading benchmark/profiling results from the PVC
- `dynamo_deployment.py` — utilities for working with DynamoGraphDeployment resources
- `requirements.txt` — Python dependencies for benchmarking utilities
## Quick start
### Kubernetes Setup (one-time per namespace)
### Benchmarking Resource Setup
Use the helper script to prepare a Kubernetes namespace with the common manifests and install the operator. This provides a **fully encapsulated deployment setup**.
This script creates a Kubernetes namespace with the given name if it does not yet exist. It then applies common manifests (serviceaccount, role, rolebinding, pvc), installs CRDs, creates secrets, and deploys the Dynamo Cloud Operator to your namespace.
If your namespace is already set up, you can skip this step.
After setting up Dynamo Cloud, use this script to prepare your namespace with the additional resources needed for benchmarking and profiling workflows:
```bash
export HF_TOKEN=<HF_TOKEN>
export DOCKER_SERVER=<YOUR_DOCKER_SERVER>
NAMESPACE=benchmarking HF_TOKEN=$HF_TOKEN DOCKER_SERVER=$DOCKER_SERVER deploy/utils/setup_k8s_namespace.sh
export NAMESPACE=your-dynamo-namespace
export HF_TOKEN=<HF_TOKEN> # Optional: for HuggingFace model access
# IF you want to build and push a new Docker image for the Dynamo Cloud Operator, include an IMAGE_TAG
# NAMESPACE=benchmarking HF_TOKEN=$HF_TOKEN DOCKER_SERVER=$DOCKER_SERVER IMAGE_TAG=latest deploy/utils/setup_k8s_namespace.sh
deploy/utils/setup_benchmarking_resources.sh
```
This script applies the following manifests:
This script applies the following manifests to your existing Dynamo namespace:
- `deploy/utils/manifests/serviceaccount.yaml` - ServiceAccount `dynamo-sa`
- `deploy/utils/manifests/role.yaml` - Role `dynamo-role`
- `deploy/utils/manifests/rolebinding.yaml` - RoleBinding `dynamo-binding`
- `deploy/utils/manifests/pvc.yaml` - PVC `dynamo-pvc`
If `DOCKER_SERVER` and `IMAGE_TAG` are not both provided, the script deploys the operator using the default published image `nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.4.0`.
To build/push and use a new image instead, pass both `DOCKER_SERVER` and `IMAGE_TAG`.
This script also installs the Dynamo CRDs if not present.
If the registry is private, either pass credentials or respond to the prompt:
If `HF_TOKEN` is provided, it also creates a secret for HuggingFace model access.
```bash
NAMESPACE=benchmarking \
DOCKER_SERVER=my-registry.example.com \
IMAGE_TAG=latest \
DOCKER_USERNAME='$oauthtoken' \
DOCKER_PASSWORD=<token> \
deploy/utils/setup_k8s_namespace.sh
```
If `DOCKER_SERVER`/`IMAGE_TAG` are omitted, the script installs the default operator image `nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.4.0`.
After running the setup script, verify the installation by checking the pods:
After running the setup script, verify the resources by checking:
```bash
kubectl get pods -n $NAMESPACE
```
The output should look something like:
```
NAME READY STATUS RESTARTS AGE
dynamo-platform-dynamo-operator-controller-manager-xxxxx 2/2 Running 0 5m
dynamo-platform-etcd-0 1/1 Running 0 5m
dynamo-platform-nats-0 2/2 Running 0 5m
dynamo-platform-nats-box-xxxxx 1/1 Running 0 5m
kubectl get serviceaccount dynamo-sa -n $NAMESPACE
kubectl get pvc dynamo-pvc -n $NAMESPACE
```
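The two checks above cover the ServiceAccount and the PVC. A minimal sketch of a loop that also covers the Role and RoleBinding listed earlier might look like the following. The helper name `check_resources` and the `KUBECTL` override are hypothetical and not part of this commit; the resource names are the ones from the manifests above.

```bash
# Sketch only: check_resources and KUBECTL are hypothetical names, not part of the repo.
KUBECTL="${KUBECTL:-kubectl}"   # override to stub kubectl in tests

# Print one status line per TYPE/NAME resource; return 1 if any is missing.
check_resources() {
  local ns="$1"; shift
  local res missing=0
  for res in "$@"; do
    if "$KUBECTL" get "$res" -n "$ns" >/dev/null 2>&1; then
      echo "[OK] $res"
    else
      echo "[MISSING] $res"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (needs a live cluster):
#   check_resources "$NAMESPACE" serviceaccount/dynamo-sa role/dynamo-role \
#     rolebinding/dynamo-binding pvc/dynamo-pvc
```

The function returns non-zero when any resource is missing, so it can gate a later benchmarking step in a script.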
### PVC Manipulation Scripts
......@@ -120,11 +104,11 @@ python3 -m deploy.utils.download_pvc_results \
#### Next Steps
For complete benchmarking workflows:
For complete benchmarking and profiling workflows:
- **Benchmarking Guide**: See [docs/benchmarks/benchmarking.md](../../docs/benchmarks/benchmarking.md) for comparing DynamoGraphDeployments and external endpoints
- **Pre-Deployment Profiling**: See [docs/benchmarks/pre_deployment_profiling.md](../../docs/benchmarks/pre_deployment_profiling.md) for optimizing configurations before deployment
## Notes
- Benchmarking scripts (`benchmarks/benchmark.sh`, `benchmarks/deploy_benchmark.sh`) call this setup automatically when present.
- Profiling job manifest remains in `benchmarks/profiler/deploy/profile_sla_job.yaml` and now relies on the common ServiceAccount/PVC here.
- Profiling job manifest remains in `benchmarks/profiler/deploy/profile_sla_job.yaml` and relies on the ServiceAccount/PVC created by the setup script.
- This setup is focused on benchmarking and profiling resources only - the main Dynamo platform must be installed separately.
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
set -euo pipefail
# Resolve repo root relative to this script
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Inputs
NAMESPACE="${NAMESPACE:-default}"
HF_TOKEN="${HF_TOKEN:-}"
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
err() { echo -e "${RED}[ERROR]${NC} $*"; }
usage() {
cat << EOF
Usage:
NAMESPACE=<ns> [HF_TOKEN=<token>] deploy/utils/setup_benchmarking_resources.sh
Sets up benchmarking and profiling resources in an existing Dynamo namespace:
- Applies common manifests (ServiceAccount, Role, RoleBinding, PVC)
- Creates HuggingFace token secret if HF_TOKEN provided
- Installs benchmark dependencies if requirements.txt exists
Prerequisites:
- Dynamo Cloud platform must already be installed in the namespace
- kubectl must be configured and pointing to the target cluster
Environment variables:
NAMESPACE Target Kubernetes namespace (default: default)
HF_TOKEN Hugging Face token; if set, a secret named hf-token-secret is created (optional)
EOF
}
if ! command -v kubectl &>/dev/null; then err "kubectl not found"; exit 1; fi
# Check if namespace exists
if ! kubectl get namespace "$NAMESPACE" &>/dev/null; then
err "Namespace $NAMESPACE does not exist. Please create it first or install Dynamo Cloud platform."
exit 1
fi
# Check if Dynamo platform is installed
if ! kubectl get pods -n "$NAMESPACE" | grep -q "dynamo-platform"; then
warn "Dynamo platform pods not found in namespace $NAMESPACE"
warn "Please ensure Dynamo Cloud platform is installed first:"
warn " See: docs/guides/dynamo_deploy/installation_guide.md"
if [[ -z "${FORCE:-}" && -z "${YES:-}" ]]; then
read -p "Continue anyway? [y/N]: " -r ans
[[ "$ans" =~ ^[Yy]$ ]] || exit 1
else
warn "Continuing due to FORCE/YES set."
fi
fi
# Apply common manifests
log "Applying benchmarking manifests to namespace $NAMESPACE"
export NAMESPACE # ensure envsubst can see it
for mf in "$(dirname "$0")/manifests"/*.yaml; do
if [[ -f "$mf" ]]; then
if command -v envsubst >/dev/null 2>&1; then
envsubst < "$mf" | kubectl -n "$NAMESPACE" apply -f -
else
warn "envsubst not found; applying manifest without substitution: $mf"
kubectl -n "$NAMESPACE" apply -f "$mf"
fi
fi
done
ok "Benchmarking manifests applied"
# Optional: Create Hugging Face token secret if HF_TOKEN provided
if [[ -n "$HF_TOKEN" ]]; then
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN="$HF_TOKEN" \
-n "$NAMESPACE" \
--dry-run=client -o yaml | kubectl apply -f -
ok "hf-token-secret created/updated"
fi
ok "Benchmarking resource setup complete"
# Verify installation
log "Verifying installation..."
kubectl get serviceaccount dynamo-sa -n "$NAMESPACE" >/dev/null && ok "ServiceAccount dynamo-sa exists" || err "ServiceAccount dynamo-sa not found"
kubectl get pvc dynamo-pvc -n "$NAMESPACE" >/dev/null && ok "PVC dynamo-pvc exists" || err "PVC dynamo-pvc not found"
if [[ -n "$HF_TOKEN" ]]; then
kubectl get secret hf-token-secret -n "$NAMESPACE" >/dev/null && ok "Secret hf-token-secret exists" || err "Secret hf-token-secret not found"
fi
\ No newline at end of file
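The platform check in the script above pipes `kubectl get pods` into `grep -q` while `set -o pipefail` is active. Because `grep -q` can exit as soon as it finds a match, the upstream command may in principle see a closed pipe and fail the whole pipeline, which would read as "not found". One possible hardening, shown here as a sketch and not as what the commit does, is to capture the listing first and search it afterwards. The names `platform_pods_present` and `KUBECTL` are hypothetical.

```bash
# Sketch only: platform_pods_present and KUBECTL are hypothetical names.
KUBECTL="${KUBECTL:-kubectl}"   # override to stub kubectl in tests

# Return 0 if any pod name in the namespace contains "dynamo-platform".
platform_pods_present() {
  local ns="$1" pods
  # Capture the listing first; a failed kubectl call counts as "not found".
  pods="$("$KUBECTL" get pods -n "$ns" -o name 2>/dev/null)" || return 1
  # grep reads from a here-string, so no upstream process is left writing to a closed pipe.
  grep -q "dynamo-platform" <<<"$pods"
}

# Usage (needs a live cluster):
#   if ! platform_pods_present "$NAMESPACE"; then echo "platform pods not found"; fi
```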
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
set -euo pipefail
# Resolve repo root relative to this script
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Inputs
NAMESPACE="${NAMESPACE:-default}"
DOCKER_SERVER="${DOCKER_SERVER:-}"
IMAGE_TAG="${IMAGE_TAG:-}"
DOCKER_USERNAME="${DOCKER_USERNAME:-}"
DOCKER_PASSWORD="${DOCKER_PASSWORD:-}"
HF_TOKEN="${HF_TOKEN:-}"
PULL_SECRET_NAME=""
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log() { echo -e "${BLUE}[INFO]${NC} $*"; }
ok() { echo -e "${GREEN}[OK]${NC} $*"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
err() { echo -e "${RED}[ERROR]${NC} $*"; }
create_or_update_pull_secret() {
local server="$1"; local user="$2"; local pass="$3"
if [[ -n "$server" && -n "$user" && -n "$pass" ]]; then
log "Creating/updating docker-imagepullsecret for $server in namespace $NAMESPACE"
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server="$server" \
--docker-username="$user" \
--docker-password="$pass" \
--namespace="$NAMESPACE" \
--dry-run=client -o yaml | kubectl apply -f -
ok "docker-imagepullsecret configured"
PULL_SECRET_NAME="docker-imagepullsecret"
fi
}
usage() {
cat << EOF
Usage:
NAMESPACE=<ns> deploy/utils/setup_k8s_namespace.sh
NAMESPACE=<ns> DOCKER_SERVER=<registry> IMAGE_TAG=<tag> [DOCKER_USERNAME=<user>] [DOCKER_PASSWORD=<token>] \
deploy/utils/setup_k8s_namespace.sh
Sets up Kubernetes namespace for Dynamo (one-time per namespace):
- Creates namespace if absent
- Applies common manifests (ServiceAccount, Role, RoleBinding, PVC)
- Installs CRDs once per cluster (if not already installed)
- If DOCKER_SERVER/IMAGE_TAG are provided:
* Builds/pushes a custom operator image with Earthly
* Installs/updates the operator Helm release using that image
* If credentials (DOCKER_USERNAME/DOCKER_PASSWORD) are provided, creates/updates docker-imagepullsecret
* If credentials are not provided, prompts interactively to create the pull secret
- Otherwise installs the operator using default image: nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.4.1
Environment variables:
NAMESPACE Target Kubernetes namespace (default: default)
DOCKER_SERVER Registry server for operator image (optional)
IMAGE_TAG Image tag for operator (optional)
DOCKER_USERNAME Registry username (optional; if provided with DOCKER_PASSWORD, secret is created)
DOCKER_PASSWORD Registry password/token (optional)
HF_TOKEN Hugging Face token; if set, a secret named hf-token-secret is created in the namespace (optional)
EOF
}
if ! command -v kubectl &>/dev/null; then err "kubectl not found"; exit 1; fi
# 1) Ensure namespace exists
if ! kubectl get namespace "$NAMESPACE" &>/dev/null; then
log "Creating namespace $NAMESPACE"
kubectl create namespace "$NAMESPACE"
else
log "Namespace $NAMESPACE exists"
fi
# 2) Apply common manifests
log "Applying common manifests to namespace $NAMESPACE"
for mf in "$(dirname "$0")/manifests"/*.yaml; do
envsubst < "$mf" | kubectl apply -f -
done
ok "Common manifests applied"
# 3) Install CRDs once per cluster (only if not already installed)
if command -v helm &>/dev/null; then
if ! helm status dynamo-crds -n "$NAMESPACE" &>/dev/null; then
log "Installing CRDs via Helm release dynamo-crds in namespace $NAMESPACE"
pushd "$REPO_ROOT/deploy/cloud/helm" >/dev/null
helm upgrade --install dynamo-crds ./crds/ \
--namespace "$NAMESPACE" \
--wait \
--atomic
popd >/dev/null
ok "CRDs installed"
fi
fi
# 4) Optional: Create Hugging Face token secret if HF_TOKEN provided
if [[ -n "$HF_TOKEN" ]]; then
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN="$HF_TOKEN" \
-n "$NAMESPACE" \
--dry-run=client -o yaml | kubectl apply -f -
ok "hf-token-secret created/updated"
fi
# 5) Optional: Create imagePullSecret for private registry if credentials provided or requested
if [[ -n "$DOCKER_SERVER" ]]; then
if [[ -n "$DOCKER_USERNAME" && -n "$DOCKER_PASSWORD" ]]; then
create_or_update_pull_secret "$DOCKER_SERVER" "$DOCKER_USERNAME" "$DOCKER_PASSWORD"
elif [[ -n "$IMAGE_TAG" ]]; then
echo
read -p "Do you need image pull credentials for $DOCKER_SERVER (private registry)? [y/N]: " -r ans
if [[ "$ans" =~ ^[Yy]$ ]]; then
read -p "Docker username (often '\$oauthtoken' for NGC): " DOCKER_USERNAME
read -s -p "Docker password/token: " DOCKER_PASSWORD; echo
if [[ -n "$DOCKER_USERNAME" && -n "$DOCKER_PASSWORD" ]]; then
create_or_update_pull_secret "$DOCKER_SERVER" "$DOCKER_USERNAME" "$DOCKER_PASSWORD"
else
warn "Username or password empty; skipping secret creation"
fi
fi
fi
fi
# 6) Operator: Build/push custom image if both vars provided, else use default NGC image
if [[ -n "$DOCKER_SERVER" && -n "$IMAGE_TAG" ]]; then
if ! command -v earthly &>/dev/null; then warn "earthly not found; skipping operator build/push"; else
log "Building and pushing operator images via earthly"
earthly --push +all-docker --DOCKER_SERVER="$DOCKER_SERVER" --IMAGE_TAG="$IMAGE_TAG"
fi
if ! command -v helm &>/dev/null; then warn "helm not found; skipping helm install"; else
pushd "$REPO_ROOT/deploy/cloud/helm/platform" >/dev/null
helm dep build
popd >/dev/null
pushd "$REPO_ROOT/deploy/cloud/helm" >/dev/null
# Build Helm args
HELM_ARGS=(upgrade dynamo-platform ./platform/ --install --namespace "$NAMESPACE" \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}")
if [[ -n "$PULL_SECRET_NAME" ]]; then
HELM_ARGS+=(--set "dynamo-operator.imagePullSecrets[0].name=${PULL_SECRET_NAME}")
fi
helm "${HELM_ARGS[@]}"
popd >/dev/null
ok "Helm chart installed/updated"
fi
else
# Use default published image when custom not provided
DEFAULT_OPERATOR_IMAGE="nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.4.1"
if ! command -v helm &>/dev/null; then warn "helm not found; skipping helm install"; else
pushd "$REPO_ROOT/deploy/cloud/helm/platform" >/dev/null
helm dep build
popd >/dev/null
pushd "$REPO_ROOT/deploy/cloud/helm" >/dev/null
# Only set imagePullSecrets if the referenced secret exists; otherwise rely on SA
HELM_ARGS=(upgrade dynamo-platform ./platform/ --install --namespace "$NAMESPACE" \
--set "dynamo-operator.controllerManager.manager.image.repository=${DEFAULT_OPERATOR_IMAGE%:*}" \
--set "dynamo-operator.controllerManager.manager.image.tag=${DEFAULT_OPERATOR_IMAGE##*:}")
if kubectl get secret nvcr-imagepullsecret -n "$NAMESPACE" &>/dev/null; then
HELM_ARGS+=(--set "dynamo-operator.imagePullSecrets[0].name=nvcr-imagepullsecret")
fi
helm "${HELM_ARGS[@]}"
popd >/dev/null
ok "Helm chart installed/updated with default operator image"
fi
fi
# 7) Install benchmark dependencies if requirements.txt exists
REQUIREMENTS_FILE="$SCRIPT_DIR/requirements.txt"
if [[ -f "$REQUIREMENTS_FILE" ]]; then
log "Installing benchmark dependencies..."
if command -v uv >/dev/null 2>&1; then
uv pip install -r "$REQUIREMENTS_FILE"
elif command -v pip3 >/dev/null 2>&1; then
pip3 install -r "$REQUIREMENTS_FILE"
elif command -v pip >/dev/null 2>&1; then
pip install -r "$REQUIREMENTS_FILE"
else
warn "No pip/pip3/uv found; skipping benchmark dependency installation"
warn "To run benchmarks, manually install: pip install -r $REQUIREMENTS_FILE"
fi
ok "Benchmark dependencies installed"
fi
ok "Kubernetes namespace setup complete"
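The default-image branch of the removed script derives the Helm `image.repository` and `image.tag` values from `DEFAULT_OPERATOR_IMAGE` with the parameter expansions `%:*` and `##*:`. A small sketch of that split, using a hypothetical helper named `split_image_ref`:

```bash
# Sketch only: split_image_ref is a hypothetical helper, not part of the repo.
# Prints "<repository> <tag>" for an image reference of the form repo:tag.
split_image_ref() {
  local ref="$1"
  local repo="${ref%:*}"    # drop the shortest suffix matching ":*" (the tag)
  local tag="${ref##*:}"    # drop the longest prefix matching "*:" (keep the tag)
  echo "$repo $tag"
}

split_image_ref "nvcr.io/nvidia/ai-dynamo/kubernetes-operator:0.4.1"
# prints: nvcr.io/nvidia/ai-dynamo/kubernetes-operator 0.4.1
```

This split assumes the reference always carries a tag. An untagged reference with a registry port, such as `localhost:5000/operator`, would be cut at the port colon, so this form is only safe for fixed, tagged defaults like the one in the script.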
......@@ -37,9 +37,15 @@ The framework is a wrapper around `genai-perf` that:
## Prerequisites
1. **Kubernetes cluster with NVIDIA GPUs and Dynamo namespace setup** - You need a Kubernetes cluster with eligible NVIDIA GPUs and a properly configured namespace for Dynamo benchmarking. See the [deploy/utils/README](../../deploy/utils/README.md) for complete setup instructions.
1. **Kubernetes cluster with NVIDIA GPUs and Dynamo Cloud platform** - You need a Kubernetes cluster with eligible NVIDIA GPUs and the Dynamo Cloud platform installed. First follow the [installation guide](../../guides/dynamo_deploy/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
2. **kubectl access** - You need `kubectl` installed and configured to access your Kubernetes cluster. All other required tools (GenAI-Perf, Python, etc.) are included in the Dynamo containers. If you are not working within a Dynamo container, you can install the necessary requirements using `deploy/utils/requirements.txt`. *Note: if you are on Ubuntu 22.04 or lower, you will also need to build perf_analyzer [from source](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/install.md#build-from-source).*
2. **kubectl access** - You need `kubectl` installed and configured to access your Kubernetes cluster.
3. **Benchmark dependencies** - Since benchmarks run locally, you need to install the required Python dependencies. Install them using:
```bash
pip install -r deploy/utils/requirements.txt
```
*Note: if you are on Ubuntu 22.04 or lower, you will also need to build perf_analyzer [from source](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/install.md#build-from-source).*
## Quick Start Examples
......
......@@ -89,7 +89,7 @@ SLA planner can work with any interpolation data that follows the above format.
## Running the Profiling Script in Kubernetes
Set up your Kubernetes namespace (one-time per namespace). Follow the instructions [here](../../deploy/utils/README.md#kubernetes-setup-one-time-per-namespace). If your namespace is already set up, skip this step.
Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](../../deploy/README.md), then set up profiling resources using [deploy/utils/README](../../deploy/utils/README.md). If your namespace is already set up, skip this step.
**Prerequisites**: Ensure all dependencies are installed. If you ran the setup script above, dependencies are already installed. Otherwise, install them manually:
```bash
......