Commit b4d56a57 authored by Dmitry Tokarev, committed by GitHub

chore: Renamed Triton Distributed to Dynamo (#56)

parent dd7646ef
@@ -25,20 +25,20 @@ Contributions intended to add significant new functionality must
follow a more collaborative path described in the following
points. Before submitting a large PR that adds a major enhancement or
extension, be sure to submit a GitHub issue that describes the
-proposed change so that the Triton team can provide feedback.
+proposed change so that the Dynamo team can provide feedback.
- As part of the GitHub issue discussion, a design for your change
will be agreed upon. An up-front design discussion is required to
ensure that your enhancement is done in a manner that is consistent
-with Triton Distributed's overall architecture.
+with Dynamo's overall architecture.
-- The Triton Distributed project is spread across multiple GitHub Repositories.
-The Triton team will provide guidance about how and where your enhancement
+- The Dynamo project is spread across multiple GitHub Repositories.
+The Dynamo team will provide guidance about how and where your enhancement
should be implemented.
-- Testing is a critical part of any Triton
+- Testing is a critical part of any Dynamo
enhancement. You should plan on spending significant time on
-creating tests for your change. The Triton team will help you to
+creating tests for your change. The Dynamo team will help you to
design your testing so that it is compatible with existing testing
infrastructure.
@@ -75,7 +75,7 @@ proposed change so that the Triton team can provide feedback.
- Make sure all tests pass.
-- Triton Distributed's default build assumes recent versions of
+- Dynamo's default build assumes recent versions of
dependencies (CUDA, TensorFlow, PyTorch, TensorRT,
etc.). Contributions that add compatibility with older versions of
those dependencies will be considered, but NVIDIA cannot guarantee
@@ -85,7 +85,7 @@ proposed change so that the Triton team can provide feedback.
- Make sure that you can contribute your work to open source (no
license and/or patent conflict is introduced by your code).
You must certify compliance with the
-[license terms](https://github.com/triton-inference-server/triton-distributed/blob/main/LICENSE)
+[license terms](https://github.com/ai-dynamo/dynamo/blob/main/LICENSE)
and sign off on the [Developer Certificate of Origin (DCO)](https://developercertificate.org)
described below before your pull request (PR) can be merged.
@@ -96,7 +96,7 @@ proposed change so that the Triton team can provide feedback.
All pull requests are checked against the
[pre-commit hooks](https://github.com/pre-commit/pre-commit-hooks)
-located [in the repository's top-level .pre-commit-config.yaml](https://github.com/triton-inference-server/triton-distributed/blob/main/.pre-commit-config.yaml).
+located [in the repository's top-level .pre-commit-config.yaml](https://github.com/ai-dynamo/dynamo/blob/main/.pre-commit-config.yaml).
The hooks do some sanity checking like linting and formatting.
These checks must pass to merge a change.
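As a rough illustration of what such a configuration contains (the actual file in the repository may pin different hooks and versions, so treat this as a hypothetical sketch, not the repository's real config):

```yaml
# Hypothetical .pre-commit-config.yaml sketch; hook list and revisions
# here are illustrative, not the repository's actual configuration.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace   # strip stray whitespace at line ends
      - id: end-of-file-fixer     # ensure files end with a single newline
```

Running `pre-commit run --all-files` locally applies the same checks before you push.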
@@ -123,7 +123,7 @@ Also you can use vscode extension [GitHub Local Actions](https://marketplace.vis
# Developer Certificate of Origin
-Triton Distributed is an open source product released under
+Dynamo is an open source product released under
the Apache 2.0 license (see either
[the Apache site](https://www.apache.org/licenses/LICENSE-2.0) or
the [LICENSE file](./LICENSE)). The Apache 2.0 license allows you
@@ -177,7 +177,7 @@ By making a contribution to this project, I certify that:
this project or the open source license(s) involved.
```
-We require that every contribution to Triton Distributed is signed with
+We require that every contribution to Dynamo is signed with
a Developer Certificate of Origin. Additionally, please use your real name.
We do not accept anonymous contributors nor those utilizing pseudonyms.
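The sign-off itself is added by git's `-s`/`--signoff` flag, which appends a `Signed-off-by:` trailer built from your configured identity. A small sketch using a throwaway repository and a placeholder identity (substitute your real name and email):

```shell
# Demonstrate the DCO sign-off trailer in a throwaway repository.
# "Jane Developer" / jane@example.com are placeholders for your real identity.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"
echo "example" > file.txt
git add file.txt
# -s appends a "Signed-off-by:" trailer to the commit message
git commit -q -s -m "docs: example change"
git log -1 --format=%B
```

The last line of the printed commit message is `Signed-off-by: Jane Developer <jane@example.com>`, which is what the DCO check looks for.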
@@ -63,7 +63,7 @@ TENSORRTLLM_BASE_IMAGE_TAG=${TENSORRTLLM_BASE_VERSION}-trtllm-python-py3
# used in the base image above.
TENSORRTLLM_BACKEND_REPO_TAG=triton-llm/v0.17.0
# Set this as 1 to rebuild and replace trtllm backend bits in the container.
-# This will allow building triton distributed container image with custom
+# This will allow building Dynamo container image with custom
# trt-llm backend repo branch.
TENSORRTLLM_BACKEND_REBUILD=0
# Set this as 1 to skip cloning the trt-llm backend repo. If cloning is skipped, trt-llm
@@ -247,7 +247,7 @@ get_options() {
fi
if [ -z "$TAG" ]; then
-TAG="--tag triton-distributed:${VERSION}-${FRAMEWORK,,}"
+TAG="--tag dynamo:${VERSION}-${FRAMEWORK,,}"
if [ ! -z ${TARGET} ]; then
TAG="${TAG}-${TARGET}"
fi
@@ -265,7 +265,7 @@ get_options() {
show_image_options() {
echo ""
-echo "Building Triton Distributed Image: '${TAG}'"
+echo "Building Dynamo Image: '${TAG}'"
echo ""
echo " Base: '${BASE_IMAGE}'"
echo " Base_Image_Tag: '${BASE_IMAGE_TAG}'"
@@ -340,7 +340,7 @@ if [ ! -z ${HF_TOKEN} ]; then
BUILD_ARGS+=" --build-arg HF_TOKEN=${HF_TOKEN} "
fi
-LATEST_TAG="--tag triton-distributed:latest-${FRAMEWORK,,}"
+LATEST_TAG="--tag dynamo:latest-${FRAMEWORK,,}"
if [ ! -z ${TARGET} ]; then
LATEST_TAG="${LATEST_TAG}-${TARGET}"
fi
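The `${FRAMEWORK,,}` expansion in these tags is bash's case-modification parameter expansion (bash 4+), which lowercases the value. A self-contained sketch of how the tag is assembled, with made-up `VERSION`/`FRAMEWORK`/`TARGET` values for illustration:

```shell
# Sketch of the tag construction above; the values are illustrative only.
VERSION="0.1.0"
FRAMEWORK="VLLM"
TARGET="dev"

# ${FRAMEWORK,,} lowercases the framework name (bash 4+ expansion)
TAG="--tag dynamo:${VERSION}-${FRAMEWORK,,}"
if [ ! -z "${TARGET}" ]; then
    TAG="${TAG}-${TARGET}"
fi
echo "$TAG"   # --tag dynamo:0.1.0-vllm-dev
```

With `TARGET` unset, the tag would simply be `--tag dynamo:0.1.0-vllm`.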
@@ -178,7 +178,7 @@ get_options() {
fi
if [ -z "$IMAGE" ]; then
-IMAGE="triton-distributed:latest-${FRAMEWORK,,}"
+IMAGE="dynamo:latest-${FRAMEWORK,,}"
if [ ! -z ${TARGET} ]; then
IMAGE="${IMAGE}-${TARGET}"
fi
@@ -15,9 +15,9 @@ See the License for the specific language governing permissions and
limitations under the License.
-->
-# TensorRT-LLM Integration with Triton Distributed
+# TensorRT-LLM Integration with Dynamo
-This example demonstrates how to use Triton Distributed to serve large language models with the tensorrt_llm engine, enabling efficient model serving with both monolithic and disaggregated deployment options.
+This example demonstrates how to use Dynamo to serve large language models with the tensorrt_llm engine, enabling efficient model serving with both monolithic and disaggregated deployment options.
## Prerequisites
@@ -58,7 +58,7 @@ python3 scripts/build_wheel.py --clean --trt_root /usr/local/tensorrt -a native
cp build/tensorrt_llm-*.whl /home
```
-- Build the Triton Distributed container
+- Build the Dynamo container
```bash
# Build image
./container/build.sh --base-image gitlab-master.nvidia.com:5005/dl/dgx/tritonserver/tensorrt-llm/amd64 --base-image-tag krish-fix-trtllm-build.23766174
@@ -73,7 +73,7 @@ Alternatively, you can build with latest tensorrt_llm pipeline like below:
## Launching the Environment
```
-# Run image interactively from with the triton distributed root directory.
+# Run the image interactively from within the Dynamo root directory.
./container/run.sh --framework TENSORRTLLM -it -v /home/:/home/
# Install the TRT-LLM wheel. No need to do this if you are using the latest tensorrt_llm image.
@@ -306,7 +306,7 @@ export ETCD_ENDPOINTS="http://node1:2379,http://node2:2379"
3. Launch the workers from node1 or login node. WORLD_SIZE is similar to single node deployment.
```bash
-srun --mpi pmix -N NUM_NODES --ntasks WORLD_SIZE --ntasks-per-node=WORLD_SIZE --no-container-mount-home --overlap --container-image IMAGE --output batch_%x_%j.log --err batch_%x_%j.err --container-mounts PATH_TO_TRITON_DISTRIBUTED:/workspace --container-env=NATS_SERVER,ETCD_ENDPOINTS bash -c 'cd /workspace/examples/python_rs/llm/tensorrt_llm && python3 -m disaggregated.worker --engine_args llm_api_config.yaml -c disaggregated/llmapi_disaggregated_configs/multi_node_config.yaml' &
+srun --mpi pmix -N NUM_NODES --ntasks WORLD_SIZE --ntasks-per-node=WORLD_SIZE --no-container-mount-home --overlap --container-image IMAGE --output batch_%x_%j.log --err batch_%x_%j.err --container-mounts PATH_TO_DYNAMO:/workspace --container-env=NATS_SERVER,ETCD_ENDPOINTS bash -c 'cd /workspace/examples/python_rs/llm/tensorrt_llm && python3 -m disaggregated.worker --engine_args llm_api_config.yaml -c disaggregated/llmapi_disaggregated_configs/multi_node_config.yaml' &
```
Once the workers are launched, you should see the output similar to the following in the worker logs.
@@ -323,7 +323,7 @@ Once the workers are launched, you should see the output similar to the followin
4. Launch the router from node1 or login node.
```bash
-srun --mpi pmix -N 1 --ntasks 1 --ntasks-per-node=1 --overlap --container-image IMAGE --output batch_router_%x_%j.log --err batch_router_%x_%j.err --container-mounts PATH_TO_TRITON_DISTRIBUTED:/workspace --container-env=NATS_SERVER,ETCD_ENDPOINTS bash -c 'cd /workspace/examples/python_rs/llm/tensorrt_llm && python3 -m disaggregated.router' &
+srun --mpi pmix -N 1 --ntasks 1 --ntasks-per-node=1 --overlap --container-image IMAGE --output batch_router_%x_%j.log --err batch_router_%x_%j.err --container-mounts PATH_TO_DYNAMO:/workspace --container-env=NATS_SERVER,ETCD_ENDPOINTS bash -c 'cd /workspace/examples/python_rs/llm/tensorrt_llm && python3 -m disaggregated.router' &
```
5. Send requests to the router.
@@ -15,9 +15,9 @@ See the License for the specific language governing permissions and
limitations under the License.
-->
-# vLLM Integration with Triton Distributed
+# vLLM Integration with Dynamo
-This example demonstrates how to use Triton Distributed to serve large language models with the vLLM engine, enabling efficient model serving with both monolithic and disaggregated deployment options.
+This example demonstrates how to use Dynamo to serve large language models with the vLLM engine, enabling efficient model serving with both monolithic and disaggregated deployment options.
## Prerequisites
@@ -38,7 +38,7 @@ Start required services (etcd and NATS):
## Building the Environment
-The example is designed to run in a containerized environment using Triton Distributed, vLLM, and associated dependencies. To build the container:
+The example is designed to run in a containerized environment using Dynamo, vLLM, and associated dependencies. To build the container:
```bash
# Build image
@@ -15,9 +15,9 @@ See the License for the specific language governing permissions and
limitations under the License.
-->
-# Triton Distributed Python Bindings
+# Dynamo Python Bindings
-Python bindings for the Triton distributed runtime system, enabling distributed computing capabilities for machine learning workloads.
+Python bindings for the Dynamo runtime system, enabling distributed computing capabilities for machine learning workloads.
## 🚀 Quick Start
@@ -56,7 +56,7 @@ See [README.md](/lib/runtime/README.md).
1. Start 3 separate shells, and activate the virtual environment in each
```
-cd python-wheels/triton-distributed
+cd python-wheels/dynamo
source .venv/bin/activate
```
@@ -15,13 +15,13 @@ See the License for the specific language governing permissions and
limitations under the License.
-->
-# Triton Distributed Runtime
+# Dynamo Runtime
<h4>A Datacenter Scale Distributed Inference Serving Framework</h4>
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-Rust implementation of the Triton distributed runtime system, enabling distributed computing capabilities for machine learning workloads.
+Rust implementation of the Dynamo runtime system, enabling distributed computing capabilities for machine learning workloads.
## 🛠️ Prerequisites