..
    SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    SPDX-License-Identifier: Apache-2.0

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

..
   Main Page
..

Welcome to NVIDIA Dynamo
========================

The NVIDIA Dynamo Platform is a high-performance, low-latency inference framework designed to serve all AI models across any framework, architecture, or deployment scale.

.. admonition:: 💎 Discover the latest developments!
   :class: seealso

   This guide is a snapshot from a specific point in time. For the latest information, examples, and release assets, see the `Dynamo GitHub repository <https://github.com/ai-dynamo/dynamo/releases/latest>`_.

Quickstart
==========
.. include:: _includes/quick_start_local.rst

..
   Sidebar
..

.. toctree::
   :hidden:
   :caption: Getting Started

   Quickstart <self>
   Support Matrix <reference/support-matrix.md>
   Feature Matrix <reference/feature-matrix.md>
   Release Artifacts <reference/release-artifacts.md>
   Examples <_sections/examples>

.. toctree::
   :hidden:
   :caption: Kubernetes Deployment

   Deployment Guide <kubernetes/README>
   Observability (K8s) <kubernetes/observability/metrics>
   Multinode <kubernetes/deployment/multinode-deployment>

.. toctree::
   :hidden:
   :caption: User Guides

   KV Cache Aware Routing <components/router/router_guide.md>
   Disaggregated Serving Guide <features/disaggregated_serving/README.md>
   KV Cache Offloading <components/kvbm/kvbm_guide.md>
   Benchmarking <benchmarks/benchmarking.md>
   Multimodality Support <features/multimodal/README.md>
   Tool Calling <agents/tool-calling.md>
   LoRA Adapters <features/lora/README.md>
   Observability (Local) <observability/README>
   Fault Tolerance <fault_tolerance/README>
   Writing Python Workers in Dynamo <development/backend-guide.md>

.. toctree::
   :hidden:
   :caption: Components

   Backends <_sections/backends>
   Frontend <components/frontend/README>
   Router <components/router/README>
   Planner <components/planner/README>
   Profiler <components/profiler/README>
   KVBM <components/kvbm/README>

.. toctree::
   :hidden:
   :caption: Integrations

   LMCache <integrations/lmcache_integration.md>
   SGLang HiCache <integrations/sglang_hicache.md>
   FlexKV <integrations/flexkv_integration.md>
   KV Events for Custom Engines <integrations/kv_events_custom_engines.md>

.. toctree::
   :hidden:
   :caption: Design Docs

   Overall Architecture <design_docs/architecture.md>
   Architecture Flow <design_docs/dynamo_flow.md>
   Disaggregated Serving <design_docs/disagg_serving.md>
   Distributed Runtime <design_docs/distributed_runtime.md>
   Request Plane <design_docs/request_plane.md>
   Event Plane <design_docs/event_plane.md>
   Router Design <design_docs/router_design.md>
   KVBM Design <design_docs/kvbm_design.md>
   Planner Design <design_docs/planner_design.md>