- 03 Jun, 2025 8 commits
-
-
Graham King authored
To talk to the vllm/sglang/trtllm engine we previously hardcoded an endpoint. The user never sees it, so it doesn't matter which one we use. However, if you try to run _two_ instances of Dynamo on one machine they will conflict. Use a UUID as the component name to resolve that. Part of the solution for: https://github.com/ai-dynamo/dynamo/issues/1073
-
hhzhang16 authored
-
Abrar Shivani authored
This PR modifies the mistralrs engine to ensure that the maximum output token length never exceeds the context length provided.
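The clamping described above can be sketched as follows. This is a minimal illustration only, not the code from the PR: the function name `clamp_max_output_tokens` and its parameters are hypothetical, and it assumes that a request without an explicit limit falls back to the context length.

```rust
/// Hypothetical helper (not the actual mistralrs engine code).
/// Returns an output-token limit that never exceeds `context_length`.
pub fn clamp_max_output_tokens(requested: Option<usize>, context_length: usize) -> usize {
    match requested {
        // An explicit request is capped at the context length.
        Some(n) => n.min(context_length),
        // Assumption: with no explicit request, use the context length itself.
        None => context_length,
    }
}
```

With a context length of 4096, a request for 8192 tokens is reduced to 4096, while a request for 100 tokens is passed through unchanged.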
-
Paul Hendricks authored
-
J Wyman authored
Creates a README.md file for Connect. The README contains an overview, examples with diagrams, and documentation of the important classes. The README is not intended to be comprehensive. Instead it's meant to be more of a "getting started" or "learn the basics" guide. More comprehensive documentation is available from the inline comments. Additionally, updates the Multimodal Example: Moves the remote and local prefill code from the generate method into remote_prefill and local_prefill respectively. Code changes made. Replaces references to "agent" with "worker" for consistency throughout the inline documentation. Only comments updated. No code changes made. The intention of this change is to improve readability of the example code and to provide clearer examples to reference from within documentation. DIS-101
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: jothomson <jwillthomson19@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
Hongkuan Zhou authored
-
ptarasiewiczNV authored
-
- 02 Jun, 2025 10 commits
-
-
hhzhang16 authored
Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
Ryan McCormick authored
-
Graham King authored
Do not include by default as it needs libgomp1 at runtime. Add a feature to enable it at build time.
-
julienmancuso authored
-
Graham King authored
This allows building only the `mistral.rs` engine (`--no-default-features --features mistralrs`) or only the `llama.cpp` engine (`--no-default-features --features llamacpp`). Since llama.cpp became a default we had only tested building both at once. The docs already said we supported that, but there was some combination of Rust features that didn't build. This is the fix.
-
ptarasiewiczNV authored
-
julienmancuso authored
-
Ryan McCormick authored
-
Hongkuan Zhou authored
-
Graham King authored
It was confusing to have two names for one type. This tidy-up, started in #1064, is now complete.
-
- 31 May, 2025 3 commits
-
-
Ryan McCormick authored
-
Hongkuan Zhou authored
-
mohammedabdulwahhab authored
-
- 30 May, 2025 13 commits
-
-
Biswa Panda authored
-
Olga Andreeva authored
-
Kris Hung authored
-
Ryan McCormick authored
-
jain-ria authored
-
Graham King authored
Unify them with all our other logs so that we can filter with DYN_LOG, they will eventually go to the log aggregation, etc.
-
Anant Sharma authored
-
Alec authored
-
jthomson04 authored
-
julienmancuso authored
-
ishandhanani authored
-
Biswa Panda authored
-
Tanmay Verma authored
-
- 29 May, 2025 6 commits
-
-
Graham King authored
Previously `mistral.rs` was the default engine for both safetensors and GGUF models. Now it is the default only for safetensors, and `llama.cpp` becomes the default for GGUF. Why? First, since #1177 `llama.cpp` is built in by default, so we can switch. Second, `llama.cpp` is very good at running GGUF (but can't run other types of model), so we should switch. Dynamo's multi-engine support gives us a secret super-power: we can use the best engine for each specific format or model. We can still run GGUF with mistralrs by doing `out=mistralrs`.
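The selection rule in this commit message can be sketched as below. This is an illustrative sketch, not Dynamo's implementation: the `ModelFormat` and `Engine` enums and both function names are hypothetical, and the `llamacpp` value for `out=` is assumed by analogy with the `mistralrs` value mentioned above.

```rust
/// Hypothetical model formats, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelFormat {
    Safetensors,
    Gguf,
}

/// Hypothetical engine identifiers, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Engine {
    MistralRs,
    LlamaCpp,
}

/// Default engine per format: mistral.rs for safetensors, llama.cpp for GGUF.
pub fn default_engine(format: ModelFormat) -> Engine {
    match format {
        ModelFormat::Safetensors => Engine::MistralRs,
        ModelFormat::Gguf => Engine::LlamaCpp,
    }
}

/// Applies an optional `out=` override on top of the per-format default.
pub fn select_engine(format: ModelFormat, out: Option<&str>) -> Result<Engine, String> {
    match out {
        // No override: use the default for this format.
        None => Ok(default_engine(format)),
        // mistral.rs can run both safetensors and GGUF.
        Some("mistralrs") => Ok(Engine::MistralRs),
        // Assumed value name; llama.cpp only runs GGUF.
        Some("llamacpp") if format == ModelFormat::Gguf => Ok(Engine::LlamaCpp),
        Some("llamacpp") => Err("llama.cpp can only run GGUF models".to_string()),
        Some(other) => Err(format!("unknown engine: {other}")),
    }
}
```

Here `select_engine(ModelFormat::Gguf, None)` picks `llama.cpp`, while passing `Some("mistralrs")` keeps GGUF on mistral.rs, matching the override described in the message.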
-
Tanmay Verma authored
-
jthomson04 authored
-
J Wyman authored
This change corrects the README.md file in the examples/multimodal folder: - Correct "vllm worker" to "decode worker" - Correct the assertion that data is moved via NATS when embeddings are moved via RDMA. Additionally, this change replaces the textual graphs with Mermaid graphs for improved presentation on github.com.
-
Alec authored
-
Graham King authored
-