Unverified commit 6901c7c0, authored by Anant Sharma, committed by GitHub

docs: merge changes from 0.3.1 release (#1543) (#1759)


Co-authored-by: Kristen Kelleher <kkelleher@nvidia.com>
parent 2f38e10f
......@@ -62,19 +62,6 @@ Dive in: Examples
Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
Overview
--------
Dynamo is inference engine agnostic, supporting TRT-LLM, vLLM, SGLang, and others, and captures LLM-specific capabilities such as:
* **Disaggregated prefill & decode inference** - Maximizes GPU throughput and facilitates trade-offs between throughput and latency.
* **Dynamic GPU scheduling** - Optimizes performance based on fluctuating demand.
* **LLM-aware request routing** - Eliminates unnecessary KV cache re-computation.
* **Accelerated data transfer** - Reduces inference response time using NIXL.
* **KV cache offloading** - Leverages several memory hierarchies for higher system throughput.
Built in Rust for performance and in Python for extensibility, Dynamo is fully open-source
and is driven by a transparent development approach. Check out our repo at https://github.com/ai-dynamo/.
.. toctree::
:hidden:
......@@ -120,6 +107,7 @@ and is driven by a transparent development approach. Check out our repo at https
Dynamo Cloud Kubernetes Platform <guides/dynamo_deploy/dynamo_cloud.md>
Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform <guides/dynamo_deploy/operator_deployment.md>
Manual Helm Deployment <guides/dynamo_deploy/manual_helm_deployment.md>
GKE Setup Guide <guides/dynamo_deploy/gke_setup.md>
Minikube Setup Guide <guides/dynamo_deploy/minikube.md>
Model Caching with Fluid <guides/dynamo_deploy/model_caching_with_fluid.md>
......@@ -127,7 +115,7 @@ and is driven by a transparent development approach. Check out our repo at https
:hidden:
:caption: Benchmarking
Planner Benchmark Example <guides/planner_benchmark/benchmark_planner.md>
Planner Benchmark Example <guides/planner_benchmark/README.md>
.. toctree::
......