..
    SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    SPDX-License-Identifier: Apache-2.0

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

..
   Main Page
..

Welcome to NVIDIA Dynamo
========================

The NVIDIA Dynamo Platform is a high-performance, low-latency inference framework designed to serve all AI models across any framework, architecture, or deployment scale.

.. admonition:: 💎 Discover the latest developments!
   :class: seealso

   This guide is a snapshot from a specific point in time. For the latest information, examples, and release assets, see the `Dynamo GitHub repository <https://github.com/ai-dynamo/dynamo/releases/latest>`_.

Quickstart
==========
.. include:: _includes/quick_start_local.rst

..
   Sidebar
..

.. toctree::
   :hidden:
   :caption: Getting Started

   Quickstart <self>
   Installation <_sections/installation>
   Support Matrix <reference/support-matrix.md>
   Examples <_sections/examples>

.. toctree::
   :hidden:
   :caption: Kubernetes Deployment

   Deployment Guide <_sections/k8s_deployment>
   Observability (K8s) <_sections/k8s_observability>
   Multinode <_sections/k8s_multinode>

.. toctree::
   :hidden:
   :caption: User Guides

   Tool Calling <agents/tool-calling.md>
   Multimodality Support <multimodal/multimodal_intro.md>
   Finding Best Initial Configs <performance/aiconfigurator.md>
   Benchmarking <benchmarks/benchmarking.md>
   Tuning Disaggregated Performance <performance/tuning.md>
   Writing Python Workers in Dynamo <development/backend-guide.md>
   Observability (Local) <_sections/observability>
   Glossary <reference/glossary.md>

.. toctree::
   :hidden:
   :caption: Components

   Backends <_sections/backends>
   Router <router/README>
   Planner <planner/planner_intro>
   KVBM <kvbm/kvbm_intro>

.. toctree::
   :hidden:
   :caption: Design Docs

   Overall Architecture <design_docs/architecture.md>
   Architecture Flow <design_docs/dynamo_flow.md>
   Disaggregated Serving <design_docs/disagg_serving.md>
   Distributed Runtime <design_docs/distributed_runtime.md>