- 05 Jun, 2025 9 commits
-
-
Neelay Shah authored
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: pvijayakrish <pvijayakrish@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
Anant Sharma authored
-
Anant Sharma authored
-
mohammedabdulwahhab authored
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
Tanmay Verma authored
-
Yan Ru Pei authored
-
Hongkuan Zhou authored
-
Yan Ru Pei authored
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
-
Hongkuan Zhou authored
-
- 04 Jun, 2025 17 commits
-
-
Biswa Panda authored
-
julienmancuso authored
-
Paul Hendricks authored
-
Graham King authored
Publish `generation_config.json` from worker to ingress, as part of the Model Deployment Card. That allows ingress to read key fields out of it. Gemma 3 4B+ has some important information that is only in that file.
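As a rough illustration of "reading key fields" from a published `generation_config.json`, the sketch below parses the JSON and keeps a small set of fields. The function name, the `KEY_FIELDS` selection, and the sample payload are all hypothetical and are not taken from the Dynamo code; only the field names follow the usual Hugging Face generation config layout.

```python
import json

# Hypothetical subset of fields an ingress might care about (an assumption,
# not the list Dynamo actually reads).
KEY_FIELDS = ("eos_token_id", "temperature", "top_p", "top_k")


def read_generation_fields(raw_json: str) -> dict:
    """Parse a generation_config.json payload and keep only KEY_FIELDS."""
    config = json.loads(raw_json)
    fields = {name: config[name] for name in KEY_FIELDS if name in config}
    # eos_token_id may be a single integer or a list; normalize to a list
    # so callers can treat both shapes the same way.
    eos = fields.get("eos_token_id")
    if isinstance(eos, int):
        fields["eos_token_id"] = [eos]
    return fields


# Illustrative payload only; the values are made up for this example.
SAMPLE = '{"eos_token_id": [1, 106], "top_k": 64, "top_p": 0.95, "do_sample": true}'
print(read_generation_fields(SAMPLE))
```

Normalizing `eos_token_id` to a list matters for models that declare more than one end-of-sequence token, since that information may not be available from the tokenizer files alone.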
-
Kris Hung authored
-
Adit Ranadive authored
The rdma-core and libibverbs packages need to be reinstalled to use RDMA devices. The Docker container can also be built with a recent version of UCX for EFA support.
Signed-off-by: Adit Ranadive <aranadive@nvidia.com>
-
Suman Tatiraju authored
-
julienmancuso authored
-
hhzhang16 authored
feat: set model specific prompt templates in the multimodal config files, add documentation for multimodal example deployment (#1366)
-
julienmancuso authored
-
julienmancuso authored
-
Hongkuan Zhou authored
-
richardhuo-nv authored
-
mohammedabdulwahhab authored
-
Tom O'Brien authored
-
Kristen Kelleher authored
Content, format, and structural changes to the Dynamo docs for 0.3.0. Includes copyediting and the first batch of changes from the DMO review.
Signed-off-by: Kristen Kelleher <kkelleher@nvidia.com>
-
jthomson04 authored
-
- 03 Jun, 2025 10 commits
-
-
ishandhanani authored
-
Tanmay Verma authored
-
Graham King authored
To talk to the vllm/sglang/trtllm engine we previously hardcoded an endpoint. The user never sees it, so it doesn't matter which one we use. However, if you try to run _two_ instances of Dynamo on one machine, they will conflict. Use a UUID as the component name to resolve that. Part of the solution for: https://github.com/ai-dynamo/dynamo/issues/1073
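The idea of replacing a hardcoded name with a UUID can be sketched in a few lines. This is a minimal illustration using Python's standard `uuid` module; the function name and the `engine` prefix are hypothetical and do not reflect the actual Dynamo implementation.

```python
import uuid


def internal_component_name(prefix: str = "engine") -> str:
    """Return a component name that is unique per process instance.

    A random UUID (version 4) makes a collision between two instances
    on the same machine practically impossible.
    """
    return f"{prefix}-{uuid.uuid4()}"


# Two instances started on the same host get different names.
first = internal_component_name()
second = internal_component_name()
print(first, second)
```

Because the name is never shown to the user, it does not need to be stable or readable; uniqueness is the only requirement.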
-
hhzhang16 authored
-
Abrar Shivani authored
This PR modifies the mistralrs engine to ensure that the maximum output token length never exceeds the context length provided.
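The constraint described above amounts to clamping the requested output length to the context length. The sketch below shows that logic in Python for illustration; the function and parameter names are hypothetical, and this is not the engine's actual code.

```python
from typing import Optional


def clamp_max_output_tokens(requested: Optional[int], context_length: int) -> int:
    """Return an output-token limit that never exceeds context_length.

    If the request does not specify a limit, fall back to the context
    length itself (an assumption made for this sketch).
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    if requested is None:
        return context_length
    return min(requested, context_length)


print(clamp_max_output_tokens(8192, 4096))
print(clamp_max_output_tokens(256, 4096))
```

A stricter variant could also subtract the prompt length from the context length, but the commit message only states the simpler bound.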
-
Paul Hendricks authored
-
J Wyman authored
Creates a README.md file for Connect. The README contains an overview, examples with diagrams, and documentation of the important classes. It is not intended to be comprehensive; it is meant as a "getting started" or "learn the basics" guide. More comprehensive documentation is available in the inline comments.
Additionally, updates the Multimodal Example:
Moves the remote and local prefill code from the generate method into remote_prefill and local_prefill respectively. Code changes made.
Replaces references to "agent" with "worker" throughout the inline documentation for consistency. Only comments updated; no code changes made.
The intention of this change is to improve the readability of the example code and to provide clearer examples to reference from within documentation.
DIS-101
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: jothomson <jwillthomson19@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
Hongkuan Zhou authored
-
ptarasiewiczNV authored
-
- 02 Jun, 2025 4 commits
-
-
hhzhang16 authored
Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
Ryan McCormick authored
-
Graham King authored
Do not include by default as it needs libgomp1 at runtime. Add a feature to enable it at build time.
-
julienmancuso authored
-