@@ -85,7 +85,7 @@ dynamo build hello_world:Frontend --containerize
 ### 4. Run your container
-As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/docker-compose.yml).
+As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/metrics/docker-compose.yml).
 ```bash
 docker compose up -d
@@ -145,7 +145,7 @@ dynamo build graphs.agg:Frontend --containerize
 ### 4. Run your container
-As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/docker-compose.yml).
+As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/metrics/docker-compose.yml).
@@ -21,7 +21,7 @@ This guide will walk you through the process of deploying an inference graph cre
 While this guide covers deployment of Dynamo inference graphs using Helm, the preferred method to deploy an inference graph is to [deploy with the Dynamo cloud platform](operator_deployment.md). The [Dynamo cloud platform](dynamo_cloud.md) simplifies the deployment and management of Dynamo inference graphs. It includes a set of components (Operator, Kubernetes Custom Resources, etc.) that work together to streamline the deployment and management process.
 Once an inference graph is defined using the Dynamo SDK, it can be deployed onto a Kubernetes cluster using a simple `dynamo deploy` command that orchestrates the following deployment steps:
 1. Building docker images from inference graph components on the cluster
 2. Intelligently composing the encoded inference graph into a complete deployment on Kubernetes
@@ -13,7 +13,7 @@ Before proceeding with deployment, ensure you have:
 - Helm package manager
 - Rust packages and toolchain
-You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
+You must have first followed the instructions in [deploy/cloud/helm/README.md](../../../deploy/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
 **Note**: Note the `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally. This will be the endpoint the CLI uses to interface with Dynamo Cloud.
@@ -37,7 +37,7 @@ Inference graphs are compositions of service components that work together to ha
 ## Creating an inference graph
-Once you've written your various Dynamo services (docs on how to write these can be found [here](../../deploy/dynamo/sdk/docs/sdk/README.md)), you can create an inference graph by composing these services together using the following two mechanisms:
+Once you've written your various Dynamo services (docs on how to write these can be found [here](../../deploy/sdk/docs/sdk/README.md)), you can create an inference graph by composing these services together using the following two mechanisms:
 ### 1. Dependencies with `depends()`
@@ -144,7 +144,7 @@ We've provided a set of basic configurations for this example [here](../../examp
 ### 4. Serve your graph
-As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/docker-compose.yml).
+As a prerequisite, ensure you have NATS and etcd running by running the docker compose in the deploy directory. You can find it [here](../../deploy/metrics/docker-compose.yml).
@@ -93,7 +93,7 @@ This example can be deployed to a Kubernetes cluster using [Dynamo Cloud](../../
 ### Prerequisites
-You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to create your Dynamo cloud deployment.
+You must have first followed the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to create your Dynamo cloud deployment.
-Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
+Start required services (etcd and NATS) using [Docker Compose](../../deploy/metrics/docker-compose.yml)
 ```bash
-docker compose -f deploy/docker-compose.yml up -d
+docker compose -f deploy/metrics/docker-compose.yml up -d
 ```
 ### Build docker
@@ -186,7 +186,7 @@ These examples can be deployed to a Kubernetes cluster using [Dynamo Cloud](../.
 ### Prerequisites
-You must have first followed the instructions in [deploy/dynamo/helm/README.md](../../deploy/dynamo/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
+You must have first followed the instructions in [deploy/cloud/helm/README.md](../../deploy/cloud/helm/README.md) to install Dynamo Cloud on your Kubernetes cluster.
 **Note**: The `KUBE_NS` variable in the following steps must match the Kubernetes namespace where you installed Dynamo Cloud. You must also expose the `dynamo-store` service externally. This will be the endpoint the CLI uses to interface with Dynamo Cloud.
@@ -17,7 +17,7 @@ Note that this can be easily extended to more nodes. You can also run the Fronte
 **Step 1**: Start NATS/ETCD on your head node. Ensure you have the correct firewall rules to allow communication between the nodes as you will need the NATS/ETCD endpoints to be accessible by all other nodes.
 ```bash
 # node 1
-docker compose -f deploy/docker-compose.yml up -d
+docker compose -f deploy/metrics/docker-compose.yml up -d
 ```
 **Step 2**: Create the inference graph for this node. Here we will use the `agg_router.py` (even though we are doing disaggregated serving) graph because we want the `Frontend`, `Processor`, `Router`, and `VllmWorker` to spin up (we will spin up the other decode worker and prefill worker separately on different nodes later).