- 02 Apr, 2025 1 commit
-
-
Ryan Olson authored
-
- 01 Apr, 2025 1 commit
-
-
Ryan Olson authored
-
- 31 Mar, 2025 1 commit
-
-
Ryan Olson authored
-
- 19 Mar, 2025 2 commits
-
-
Anant Sharma authored
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
-
Graham King authored
This makes the Rust parts all use ring / rustls library instead of local install of openssl. It's a step on the journey to being statically linked. Pieces:
- `tokenizers` and `mistralrs` now support rustls (mistralrs by default, tokenizers with feature flag).
- Move shared dependencies up into workspace
- New `rand` crate has some renames for future rust
- Ensure the dependency doesn't creep back in by enforcing it with cargo deny.
-
- 18 Mar, 2025 2 commits
-
-
Dmitry Tokarev authored
Co-authored-by: Anant Sharma <anants@nvidia.com>
-
Harrison Saturley-Hall authored
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
-
- 17 Mar, 2025 2 commits
-
-
Graham King authored
-
GuanLuo authored
-
- 14 Mar, 2025 3 commits
-
-
Ryan McCormick authored
-
Ryan McCormick authored
-
Ryan Olson authored
-
- 13 Mar, 2025 2 commits
-
-
Anant Sharma authored
-
Graham King authored
- Any engine can take the name of a Hugging Face repository. It will be downloaded before calling the engine.
- The default engine (previously always mistralrs) depends on what is compiled in.
- Text can be piped in and will result in a single run of the model.

All of those together mean if you build with `--features vllm` you can do this and it will download the model and run it with vllm, answer your question, and exit:
```
echo "What is the capital of Costa Rica?" | dynamo-run Qwen/Qwen2.5-3B-Instruct
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
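The piped-input behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from `dynamo-run`: the `piped_prompt` function is hypothetical, and it assumes the tool decides between a single run and an interactive session by checking whether stdin is a terminal.

```rust
use std::io::{self, IsTerminal, Read};

/// Hypothetical helper: returns the piped prompt, or None when the input
/// is interactive (a terminal) or contains only whitespace.
fn piped_prompt<R: Read>(mut input: R, is_terminal: bool) -> io::Result<Option<String>> {
    if is_terminal {
        // Nothing was piped in; the caller would start an interactive session.
        return Ok(None);
    }
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    let text = text.trim();
    Ok(if text.is_empty() { None } else { Some(text.to_string()) })
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let is_terminal = stdin.is_terminal();
    match piped_prompt(stdin.lock(), is_terminal)? {
        // Assumption: a piped prompt triggers exactly one model run, then exit.
        Some(prompt) => println!("single run with prompt: {prompt}"),
        None => println!("no piped input; an interactive session would start here"),
    }
    Ok(())
}
```

Passing the terminal check in as a plain `bool` keeps the helper testable with in-memory readers.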
-
- 11 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
-
- 10 Mar, 2025 1 commit
-
-
Anant Sharma authored
-
- 09 Mar, 2025 2 commits
-
-
Neelay Shah authored
Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
-
Neelay Shah authored
Co-authored-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
-
- 08 Mar, 2025 2 commits
-
-
Dmitry Tokarev authored
-
Neelay Shah authored
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
-
- 07 Mar, 2025 2 commits
-
-
Graham King authored
There are two etcd keys:
- The service
- The model

The second one is the interesting one for us. Previously we confused the two.
-
Ryan McCormick authored
Replaces the hard-coded "kv-hit-rate" string in multiple places with the `KV_HIT_RATE_SUBJECT` constant in lib/llm.
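A minimal sketch of the pattern, assuming the constant holds the same literal that was previously hard-coded. The `namespaced_subject` helper is hypothetical and only shows a call site using the constant instead of repeating the string.

```rust
/// Subject name shared by every publisher and subscriber.
/// Assumption: the value matches the literal that used to be hard-coded.
pub const KV_HIT_RATE_SUBJECT: &str = "kv-hit-rate";

/// Hypothetical helper (not from lib/llm): builds a namespaced subject
/// from the shared constant rather than a repeated string literal.
fn namespaced_subject(namespace: &str) -> String {
    format!("{}.{}", namespace, KV_HIT_RATE_SUBJECT)
}

fn main() {
    println!("{}", namespaced_subject("demo"));
}
```

With a single definition, a typo in the subject name becomes a compile error at the call site instead of a silent mismatch between publisher and subscriber.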
-
- 06 Mar, 2025 2 commits
-
-
Ryan McCormick authored
-
Ryan McCormick authored
-
- 05 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
- 03 Mar, 2025 1 commit
-
-
Graham King authored
`cargo build --locked` won't let you use "1.85.0" if you only have "stable" installed, even if those are the same thing right now.
-
- 27 Feb, 2025 2 commits
-
-
Ryan Olson authored
-
Anant Sharma authored
-
- 26 Feb, 2025 3 commits
-
-
Ryan McCormick authored
Co-authored-by: Ryan Olson <rolson@nvidia.com>
-
Graham King authored
This means we don't need to explain the parts to the users until they are ready. We use what they provide and default the rest. Allows all of this and more:
- `tio out=tdr://test`
- `tio out=tdr://llama_8b_pool`
- `tio in=tdr://corp_ai_research_group/model_next-20250226`
- `tio out=tdr://AIRE.NIM.migrate.mistralrs.1802`

Python, API, etc all untouched.
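One way to picture "use what they provide and default the rest" is the sketch below. Everything in it is an assumption made for illustration: the commit message does not name the parts, so the three fields (`namespace`, `component`, `endpoint`), the default values, and the left-to-right fill order are all hypothetical, as is the `parse_address` function.

```rust
/// Hypothetical three-part address; the real part names are not given in the commit message.
#[derive(Debug, PartialEq)]
struct Address {
    namespace: String,
    component: String,
    endpoint: String,
}

const SCHEME: &str = "tdr://";
/// Hypothetical defaults, used for any part the user leaves out.
const DEFAULTS: [&str; 3] = ["default_ns", "default_component", "default_endpoint"];

/// Parses `tdr://a`, `tdr://a/b`, or `tdr://a.b.c`; returns None for other schemes.
fn parse_address(input: &str) -> Option<Address> {
    let path = input.strip_prefix(SCHEME)?;
    // Accept '/' or '.' as separators; anything past the second separator stays in the last part.
    let mut parts = path
        .splitn(3, |c: char| c == '/' || c == '.')
        .filter(|s| !s.is_empty());
    // Assumption: provided segments fill fields left to right, the rest come from DEFAULTS.
    let mut next = |i: usize| parts.next().unwrap_or(DEFAULTS[i]).to_string();
    Some(Address {
        namespace: next(0),
        component: next(1),
        endpoint: next(2),
    })
}

fn main() {
    for input in ["tdr://test", "tdr://corp_ai_research_group/model_next-20250226"] {
        println!("{input} -> {:?}", parse_address(input));
    }
}
```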
-
Anant Sharma authored
-
- 25 Feb, 2025 4 commits
-
-
Neelay Shah authored
-
Alec authored
Co-authored-by: aflowers <aflowers@nvidia.com>
-
Paul Hendricks authored
-
Neelay Shah authored
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-