@@ -97,7 +97,7 @@ Client-side benchmarking runs on your local machine and connects to Kubernetes d
Follow these steps to benchmark Dynamo deployments using client-side benchmarking:
### Step 1: Establish Kubernetes Cluster and Install Dynamo
Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Cloud platform. First follow the [installation guide](/docs/kubernetes/installation_guide.md) to install Dynamo Cloud, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
Set up your Kubernetes cluster with NVIDIA GPUs and install the Dynamo Kubernetes Platform. First follow the [installation guide](/docs/kubernetes/installation_guide.md) to install Dynamo Kubernetes Platform, then use [deploy/utils/README](../../deploy/utils/README.md) to set up benchmarking resources.
### Step 2: Deploy DynamoGraphDeployments
Deploy your DynamoGraphDeployments separately using the [deployment documentation](../../examples/backends/). Each deployment should have a frontend service exposed.
...
@@ -325,7 +325,7 @@ The server-side benchmarking solution:
## Prerequisites
1. **Kubernetes cluster** with NVIDIA GPUs and Dynamo namespace setup (see [Dynamo Cloud/Platform docs](/docs/kubernetes/README.md))
1. **Kubernetes cluster** with NVIDIA GPUs and Dynamo namespace setup (see [Dynamo Kubernetes Platform docs](/docs/kubernetes/README.md))
2. **Storage** PersistentVolumeClaim configured with appropriate permissions (see [deploy/utils README](../../deploy/utils/README.md))
3. **Docker image** containing the Dynamo benchmarking tools
**Dynamo Namespace**: The logical namespace used by Dynamo components for service discovery via etcd.
- Used for: Runtime component communication, service discovery
...
@@ -34,7 +34,7 @@ These are independent. A single Kubernetes namespace can host multiple Dynamo na
## Pre-deployment Checks
Before deploying the platform, it is recommended to run the pre-deployment checks to ensure the cluster is ready for deployment. Please refer to the [pre-deployment checks](../../deploy/cloud/pre-deployment/README.md) for more details.
Before deploying the platform, it is recommended to run the pre-deployment checks to ensure the cluster is ready for deployment. Please refer to the [pre-deployment checks](../../deploy/pre-deployment/README.md) for more details.
@@ -1083,17 +1083,17 @@ Default container ports are configured based on component type:
For users who want to understand the implementation details or contribute to the operator, the default values described in this document are set in the following source files:
- **Health Probes, Security Context & Pod Specifications**: [`internal/dynamo/graph.go`](https://github.com/ai-dynamo/dynamo/blob/main/deploy/cloud/operator/internal/dynamo/graph.go) - Contains the main logic for applying default probes, security context, environment variables, shared memory, and pod configurations
- **Health Probes, Security Context & Pod Specifications**: [`internal/dynamo/graph.go`](https://github.com/ai-dynamo/dynamo/blob/main/deploy/operator/internal/dynamo/graph.go) - Contains the main logic for applying default probes, security context, environment variables, shared memory, and pod configurations
@@ -97,4 +97,4 @@ For practical examples of Grove-based multinode deployments in action, see the [
For the latest updates on Grove, refer to the [official project on GitHub](https://github.com/NVIDIA/grove).
Dynamo Cloud also allows you to install Grove and KAI Scheduler as part of the platform installation. See the [Dynamo Cloud Deployment Installation Guide](./installation_guide.md) for more details.
Dynamo Kubernetes Platform also allows you to install Grove and KAI Scheduler as part of the platform installation. See the [Dynamo Kubernetes Platform Deployment Installation Guide](./installation_guide.md) for more details.
**Dynamo** - NVIDIA's high-performance distributed inference framework for Large Language Models (LLMs) and generative AI models, designed for multinode environments with disaggregated serving and cache-aware routing.
**Dynamo Cloud** - A Kubernetes platform providing managed deployment experience for Dynamo inference graphs.
**Dynamo Kubernetes Platform** - A Kubernetes platform providing managed deployment experience for Dynamo inference graphs.
## E
**Endpoint** - A specific network-accessible API within a Dynamo component, such as `generate` or `load_metrics`.