Unverified commit 403344e5 authored by hhzhang16, committed by GitHub

refactor: refactor dynamo deploy subfolder (#927)

parent 99cd9d85
......@@ -40,7 +40,7 @@ def setup_and_teardown():
"serve",
"pipeline:Frontend",
"--working-dir",
"deploy/dynamo/sdk/src/dynamo/sdk/tests",
"deploy/sdk/src/dynamo/sdk/tests",
"--Frontend.model=qwentastic",
"--Middle.bias=0.5",
"--dry-run",
......@@ -54,7 +54,7 @@ def setup_and_teardown():
"serve",
"pipeline:Frontend",
"--working-dir",
"deploy/dynamo/sdk/src/dynamo/sdk/tests",
"deploy/sdk/src/dynamo/sdk/tests",
"--Frontend.model=qwentastic",
"--Middle.bias=0.5",
]
......
......@@ -5,7 +5,7 @@ it via `dynamo serve` or `dynamo deploy`, covering basic concepts as well as
advanced features like enabling KV routing and disaggregated serving.
For detailed information about `dynamo serve` infrastructure, see the
[Dynamo SDK Docs](../deploy/dynamo/sdk/docs/sdk/README.md).
[Dynamo SDK Docs](../deploy/sdk/docs/sdk/README.md).
For a guide that walks through how to launch a vLLM-based worker with
implementation of Disaggregated Serving and KV-Aware Routing included,
......@@ -19,7 +19,7 @@ a Python class based definition that requires a few key decorators to get going:
- `@dynamo_endpoint`: marks methods that can be called by other workers or clients
For more detailed information on these concepts, see the
[Dynamo SDK Docs](../deploy/dynamo/sdk/docs/sdk/README.md).
[Dynamo SDK Docs](../deploy/sdk/docs/sdk/README.md).
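The decorator mechanism described above can be sketched in plain Python. This is an illustration only, not the Dynamo SDK API: the `endpoint` decorator and `list_endpoints` helper below are hypothetical stand-ins showing how a marker decorator like `@dynamo_endpoint` can tag methods for later discovery.

```python
# Illustrative sketch only -- NOT the Dynamo SDK. It mimics how an
# endpoint-marking decorator can work: the decorator tags a method,
# and a helper can later enumerate the callable endpoints on a class.
from typing import Callable


def endpoint(fn: Callable) -> Callable:
    """Mark a method as an externally callable endpoint (hypothetical)."""
    fn._is_endpoint = True
    return fn


class YourWorker:
    @endpoint
    def your_endpoint(self, text: str) -> str:
        # A real endpoint would handle a request; this just echoes.
        return text.upper()

    def internal_helper(self) -> None:
        # Not decorated, so not exposed as an endpoint.
        pass


def list_endpoints(cls) -> list[str]:
    """Collect the method names tagged by the decorator."""
    return [
        name
        for name, attr in vars(cls).items()
        if callable(attr) and getattr(attr, "_is_endpoint", False)
    ]
```

Here `list_endpoints(YourWorker)` would report only `your_endpoint`, since `internal_helper` carries no marker.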
### Worker Skeleton
......@@ -52,7 +52,7 @@ based on the definitions above, it would be: `your_namespace/YourWorker/your_end
- `endpoint="your_endpoint"`: Defined by the `@dynamo_endpoint` decorator, or by default the name of the function being decorated.
For more details about service configuration, resource management, and dynamo endpoints,
see the [Dynamo SDK Docs](../deploy/dynamo/sdk/docs/README.md).
see the [Dynamo SDK Docs](../deploy/sdk/docs/README.md).
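Putting the three parts together, the fully qualified endpoint address is just the namespace, component name, and endpoint name joined with slashes. A minimal sketch (the `endpoint_address` helper is hypothetical, for illustration only):

```python
# Hypothetical helper illustrating the address format from the text:
# namespace / component / endpoint.
def endpoint_address(namespace: str, component: str, endpoint: str) -> str:
    return f"{namespace}/{component}/{endpoint}"
```

With the definitions above, `endpoint_address("your_namespace", "YourWorker", "your_endpoint")` yields `your_namespace/YourWorker/your_endpoint`.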
### Request/Response Types
......@@ -628,5 +628,5 @@ For more information on Disaggregated Serving, see the
## Additional Resources
- Check the [examples](../examples/) directory for more detailed implementations
- Refer to the [Dynamo SDK Docs](../deploy/dynamo/sdk/docs/sdk/README.md) for API details.
- Refer to the [Dynamo SDK Docs](../deploy/sdk/docs/sdk/README.md) for API details.
- For Disaggregated Serving, see the [general guide](../docs/disagg_serving.md) and [performance tuning guide](../docs/guides/disagg_perf_tuning.md).
......@@ -85,7 +85,7 @@ dynamo build hello_world:Frontend --containerize
### 4. Run your container
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/docker-compose.yml).
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/metrics/docker-compose.yml).
```bash
docker compose up -d
......@@ -145,7 +145,7 @@ dynamo build graphs.agg:Frontend --containerize
### 4. Run your container
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/docker-compose.yml).
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/metrics/docker-compose.yml).
```bash
docker compose up -d
......
......@@ -25,7 +25,7 @@ Dynamo provides two distinct deployment paths, each serving different use cases:
### 1. 🚀 Dynamo Cloud Kubernetes Platform [PREFERRED]
The Dynamo Cloud Platform (`deploy/dynamo/helm/`) provides a managed deployment experience:
The Dynamo Cloud Platform (`deploy/cloud/`) provides a managed deployment experience:
- Contains the infrastructure components required for the Dynamo cloud platform
- Used when deploying with the `dynamo deploy` CLI commands
......@@ -37,15 +37,15 @@ For detailed instructions on using the Dynamo Cloud Platform, see:
### 2. Manual Deployment with Helm Charts
The manual deployment path (`deploy/Kubernetes/`) is available for users who need more control over their deployments:
The manual deployment path (`deploy/helm/`) is available for users who need more control over their deployments:
- Used for manually deploying inference graphs to Kubernetes
- Contains Helm charts and configurations for deploying individual inference pipelines
- Provides full control over deployment parameters
- Requires manual management of infrastructure components
- Documentation:
- [Deploying Dynamo Inference Graphs to Kubernetes using Helm](../../Kubernetes/pipeline/README.md): all-in-one script
- [Manual Helm Deployment Guide](manual_helm_deployment.md): detailed instructions on manual deployment
- [Using the Deployment Script](manual_helm_deployment.md#using-the-deployment-script): all-in-one script for manual deployment
- [Helm Deployment Guide](manual_helm_deployment.md#helm-deployment-guide): detailed instructions for manual deployment
## Getting Started
......
......@@ -141,7 +141,7 @@ Running the installation script with `--interactive` will guide you through the
2. [One-time Action] Create a new Kubernetes namespace and set it as your default.
```bash
cd deploy/dynamo/helm
cd deploy/cloud/helm
kubectl create namespace $NAMESPACE
kubectl config set-context --current --namespace=$NAMESPACE
```
......
......@@ -21,7 +21,7 @@ This guide will walk you through the process of deploying an inference graph cre
While this guide covers deployment of Dynamo inference graphs using Helm, the preferred method to deploy an inference graph is to [deploy with the Dynamo cloud platform](operator_deployment.md). The [Dynamo cloud platform](dynamo_cloud.md) simplifies the deployment and management of Dynamo inference graphs, with a set of components (Operator, Kubernetes Custom Resources, etc.) that work together to streamline the process.
Once an inference graph is defined using the Dynamo SDK, it can be deployed onto a Kubernetes cluster using a simple `dynamo deploy` command that orchestrates the following deployment steps:
1. Building Docker images from inference graph components on the cluster
2. Intelligently composing the encoded inference graph into a complete deployment on Kubernetes
......@@ -86,7 +86,7 @@ export PROJECT_ROOT=$(pwd)
2. Install NATS messaging system:
```bash
# Navigate to dependencies directory
cd $PROJECT_ROOT/deploy/Kubernetes/pipeline/dependencies
cd $PROJECT_ROOT/deploy/helm/dependencies
# Add and update NATS Helm repository
helm repo add nats https://nats-io.github.io/k8s/helm/charts/
......@@ -139,7 +139,7 @@ docker push <TAG>
3. Deploy using Helm:
```bash
# Navigate to the deployment directory
cd $PROJECT_ROOT/deploy/Kubernetes/pipeline
cd $PROJECT_ROOT/deploy/helm
# Set release name for Helm
export HELM_RELEASE=hello-world-manual
......@@ -167,4 +167,21 @@ curl -X 'POST' 'http://localhost:3000/generate' \
-d '{"text": "test"}'
```
For convenience, you can find a complete deployment script at `deploy/Kubernetes/pipeline/deploy.sh` that automates all of these steps.
### Using the Deployment Script
For convenience, you can use the deployment script at `deploy/helm/deploy.sh` that automates all of these steps:
```bash
export DYNAMO_IMAGE=<dynamo_docker_image_name>
./deploy.sh <docker_registry> <k8s_namespace> <path_to_dynamo_directory> <dynamo_identifier> [<dynamo_config_file>]
# Example: export DYNAMO_IMAGE=nvcr.io/nvidian/nim-llm-dev/dynamo-base-worker:0.0.1
# Example: ./deploy.sh nvcr.io/nvidian/nim-llm-dev my-namespace ../../../examples/hello_world/ hello_world:Frontend
# Example: ./deploy.sh nvcr.io/nvidian/nim-llm-dev my-namespace ../../../examples/llm graphs.disagg_router:Frontend ../../../examples/llm/configs/disagg_router.yaml
```
This script handles:
1. Building and pushing the Docker image
2. Setting up the Helm values
3. Installing/upgrading the Helm release
4. Configuring the necessary Kubernetes resources
......@@ -13,7 +13,7 @@ Before proceeding with deployment, ensure you have:
- Helm package manager
- Rust packages and toolchain
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
You must have first followed the instructions in [deploy/cloud/helm/README.md](../../../deploy/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally. This will be the endpoint the CLI uses to interface with Dynamo Cloud.
## Understanding the Deployment Process
......
......@@ -37,7 +37,7 @@ Inference graphs are compositions of service components that work together to ha
## Creating an inference graph
Once you've written your various Dynamo services (docs on how to write these can be found [here](../../deploy/dynamo/sdk/docs/sdk/README.md)), you can create an inference graph by composing these services together using the following two mechanisms:
Once you've written your various Dynamo services (docs on how to write these can be found [here](../../deploy/sdk/docs/sdk/README.md)), you can create an inference graph by composing these services together using the following two mechanisms:
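Before looking at the two mechanisms, the composition idea can be illustrated with a toy sketch. This is not the SDK's actual `depends()` implementation; the `Service` class and `graph_edges` helper are hypothetical, showing only how declaring dependencies between services yields a traversable graph:

```python
# Toy illustration (NOT the Dynamo SDK) of composing services into a
# graph by declaring dependencies between them.
class Service:
    def __init__(self, name: str):
        self.name = name
        self.deps: list["Service"] = []

    def depends(self, other: "Service") -> "Service":
        """Declare that this service calls into `other` (hypothetical)."""
        self.deps.append(other)
        return other


def graph_edges(root: Service) -> list[tuple[str, str]]:
    """Walk the dependency graph and collect (caller, callee) edges."""
    edges, stack = [], [root]
    while stack:
        svc = stack.pop()
        for dep in svc.deps:
            edges.append((svc.name, dep.name))
            stack.append(dep)
    return edges
```

For example, a `Frontend` service that depends on a `Middle` service produces a graph with the single edge `("Frontend", "Middle")`; a deployment tool can walk such edges to decide what to spin up.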
### 1. Dependencies with `depends()`
......@@ -144,7 +144,7 @@ We've provided a set of basic configurations for this example [here](../../examp
### 4. Serve your graph
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/docker-compose.yml).
As a prerequisite, ensure you have NATS and etcd running by starting the Docker Compose services in the deploy directory. You can find the compose file [here](../../deploy/metrics/docker-compose.yml).
```bash
docker compose up -d
......
......@@ -93,7 +93,7 @@ This example can be deployed to a Kubernetes cluster using [Dynamo Cloud](../../
### Prerequisites
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to create your Dynamo cloud deployment.
You must have first followed the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to create your Dynamo cloud deployment.
### Deployment Steps
......
......@@ -45,9 +45,9 @@ In this example, we will use 2 nodes to demo the disagg serving.
- Deploys DummyWorker as the monolith worker
### Prerequisites
On Node 1, start required services (etcd and NATS) using [Docker Compose](../../../deploy/docker-compose.yml)
On Node 1, start required services (etcd and NATS) using [Docker Compose](../../../deploy/metrics/docker-compose.yml)
```bash
docker compose -f deploy/docker-compose.yml up -d
docker compose -f deploy/metrics/docker-compose.yml up -d
```
### Run the Deployment
......
......@@ -64,9 +64,9 @@ sequenceDiagram
### Prerequisites
Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
Start required services (etcd and NATS) using [Docker Compose](../../deploy/metrics/docker-compose.yml)
```bash
docker compose -f deploy/docker-compose.yml up -d
docker compose -f deploy/metrics/docker-compose.yml up -d
```
### Build docker
......@@ -186,7 +186,7 @@ These examples can be deployed to a Kubernetes cluster using [Dynamo Cloud](../.
### Prerequisites
You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
You must have first followed the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
**Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally. This will be the endpoint the CLI uses to interface with Dynamo Cloud.
......
......@@ -17,7 +17,7 @@ Note that this can be easily extended to more nodes. You can also run the Fronte
**Step 1**: Start NATS/etcd on your head node. Ensure you have the correct firewall rules to allow communication between the nodes, as the NATS/etcd endpoints must be accessible from all other nodes.
```bash
# node 1
docker compose -f deploy/docker-compose.yml up -d
docker compose -f deploy/metrics/docker-compose.yml up -d
```
**Step 2**: Create the inference graph for this node. Here we will use the `agg_router.py` graph (even though we are doing disaggregated serving) because we want the `Frontend`, `Processor`, `Router`, and `VllmWorker` to spin up (we will spin up the other decode worker and prefill worker separately on different nodes later).
......
......@@ -35,9 +35,9 @@ Note: TensorRT-LLM disaggregation does not support conditional disaggregation ye
### Prerequisites
Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
Start required services (etcd and NATS) using [Docker Compose](../../deploy/metrics/docker-compose.yml)
```bash
docker compose -f deploy/docker-compose.yml up -d
docker compose -f deploy/metrics/docker-compose.yml up -d
```
### Build docker
......
......@@ -44,7 +44,7 @@ cargo test
The simplest way to deploy the prerequisite services is using
[docker-compose](https://docs.docker.com/compose/install/linux/),
defined in [deploy/docker-compose.yml](../../deploy/docker-compose.yml).
defined in [deploy/metrics/docker-compose.yml](../../deploy/metrics/docker-compose.yml).
```
docker-compose up -d
......
......@@ -44,7 +44,7 @@ cargo test
The simplest way to deploy the prerequisite services is using
[docker-compose](https://docs.docker.com/compose/install/linux/),
defined in the project's root [docker-compose.yml](docker-compose.yml).
defined in the project's root [docker-compose.yml](../../../docker-compose.yml).
```
docker-compose up -d
......
......@@ -78,7 +78,7 @@ requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["deploy/dynamo/sdk/src/dynamo", "components/planner/src/dynamo"]
packages = ["deploy/sdk/src/dynamo", "components/planner/src/dynamo"]
# This section is for including the binaries in the wheel package
# but doesn't make them executable scripts in the venv bin directory
......@@ -132,7 +132,7 @@ addopts = [
"--mypy",
"--ignore-glob=*model.py",
"--ignore-glob=*_inc.py",
"--ignore-glob=deploy/dynamo/api-store/*",
"--ignore-glob=deploy/cloud/api-store/*",
# FIXME: Get relative/generic blob paths to work here
]
xfail_strict = true
......