Commits · 6795e645d013da6dfb48cce847512d05ea283313 · OpenDAS / dynamo

02 Apr, 2025 1 commit
- feat: kv aware router executable (#399) · c4106e6a
  Ryan Olson authored Apr 02, 2025
01 Apr, 2025 1 commit
- feat: unified logging (#472) · 5b682f48
  Ryan Olson authored Apr 01, 2025
31 Mar, 2025 1 commit
- refactor: prometheus upgrade (#452) · de290537
  Ryan Olson authored Mar 31, 2025
19 Mar, 2025 2 commits
- fix: update crates metadata (#264) · 68d953f7
  Anant Sharma authored Mar 19, 2025
```
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
```
- chore: Don't depend on openssl (#292) · 7c3fd5c9
  Graham King authored Mar 19, 2025

  This makes the Rust parts all use the ring / rustls libraries instead of a local install of OpenSSL. It's a step on the journey to being statically linked.

  Pieces:
  - `tokenizers` and `mistralrs` now support rustls (mistralrs by default, tokenizers with a feature flag).
  - Move shared dependencies up into the workspace.
  - The new `rand` crate has some renames for future Rust.
  - Ensure the dependency doesn't creep back in by enforcing it with cargo deny.

18 Mar, 2025 2 commits
- docs: fix links in docs (#256) · 548578f4
  Dmitry Tokarev authored Mar 18, 2025
```
Co-authored-by: Anant Sharma <anants@nvidia.com>
```
- fix: temporary documentation for crates.io (#255) · 1ccd4caa
  Harrison Saturley-Hall authored Mar 18, 2025
```
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
```
17 Mar, 2025 2 commits
- fix(runtime): Shutdown message from eprintln to tracing debug (#219) · f46f6d0e
  Graham King authored Mar 17, 2025
- feat: expose Python binding for KVEventPublisher. Use event pub/sub trait for KV events (#169) · 6e09681e
  GuanLuo authored Mar 17, 2025
14 Mar, 2025 3 commits
- refactor: Update default log level to INFO and promote/demote a few log messages (#159) · 6a93d2c7
  Ryan McCormick authored Mar 14, 2025
- fix: Fix cargo doc warnings for lib/runtime (#150) · 0f4c1c58
  Ryan McCormick authored Mar 14, 2025
- feat: global kv block manager (#45) · f04359cf
  Ryan Olson authored Mar 13, 2025
13 Mar, 2025 2 commits
- build: add top level rust workspace (#137) · 3d292851
  Anant Sharma authored Mar 13, 2025
- feat(dynamo-run): Download models from HF, smart model defaults (#126) · 089f8e1b
  Graham King authored Mar 12, 2025

  - Any engine can take the name of a Hugging Face repository. It will be downloaded before calling the engine.
  - The default engine (previously always mistralrs) depends on what is compiled in.
  - Text can be piped in and will result in a single run of the model.

  Together, these mean that if you build with `--features vllm` you can run the following, and it will download the model, run it with vllm, answer your question, and exit:
```
echo "What is the capital of Costa Rica?" | dynamo-run Qwen/Qwen2.5-3B-Instruct
```
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
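The three behaviours described in #126 can be sketched roughly as follows. This is not the dynamo-run implementation: the `Engine` variants, the priority order in `default_engine`, the `looks_like_hf_repo` heuristic, and the `Input` type are all hypothetical names chosen for illustration. Only the `vllm` feature name and the piped-input behaviour come from the commit message.

```rust
use std::io::{self, IsTerminal, Read};
use std::path::Path;

/// Hypothetical engine list; the real set of engines may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Engine {
    Vllm,
    MistralRs,
    Fallback,
}

/// Pick a default from whatever was compiled in.
/// The priority order here is an assumption, not taken from the commit.
pub fn default_engine(has_vllm: bool, has_mistralrs: bool) -> Engine {
    if has_vllm {
        Engine::Vllm
    } else if has_mistralrs {
        Engine::MistralRs
    } else {
        Engine::Fallback
    }
}

/// Assumed heuristic: "org/name" that is not an existing local path
/// is treated as a Hugging Face repository to download.
pub fn looks_like_hf_repo(arg: &str) -> bool {
    let parts: Vec<&str> = arg.split('/').collect();
    parts.len() == 2 && parts.iter().all(|p| !p.is_empty()) && !Path::new(arg).exists()
}

/// Where the prompt comes from.
#[derive(Debug)]
pub enum Input {
    /// stdin is a terminal: keep chatting interactively.
    Interactive,
    /// stdin is a pipe or file: run the model once on this text, then exit.
    Piped(String),
}

/// Detect piped input using the standard library's terminal check.
pub fn read_input() -> io::Result<Input> {
    let stdin = io::stdin();
    if stdin.is_terminal() {
        Ok(Input::Interactive)
    } else {
        let mut text = String::new();
        stdin.lock().read_to_string(&mut text)?;
        Ok(Input::Piped(text.trim().to_string()))
    }
}
```

A caller could pass `cfg!(feature = "vllm")` and a similar check for a `mistralrs` feature (that feature name is assumed) as the two flags, so the default follows the build configuration rather than a hard-coded choice.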

11 Mar, 2025 1 commit
- refactor: Move rust binaries out of examples, update nixl dockerfile (#89) · e5db9e86
  Neelay Shah authored Mar 11, 2025
```
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
```
10 Mar, 2025 1 commit
- chore: update wheel name and reset versions (#73) · fc4da345
  Anant Sharma authored Mar 10, 2025
09 Mar, 2025 2 commits
- chore: stragglers rename (#69) · dd31a322
  Neelay Shah authored Mar 09, 2025
```
Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
```
- chore: left over renaming (#67) · 678cffb4
  Neelay Shah authored Mar 09, 2025

  Co-authored-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
  Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>

08 Mar, 2025 2 commits
- chore: Renamed Triton Distributed to Dynamo (#56) · b4d56a57
  Dmitry Tokarev authored Mar 08, 2025
- chore: rename dynamo (#44) · 602352ce
  Neelay Shah authored Mar 08, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
07 Mar, 2025 2 commits
- fix: dynemo-run model discovery working again (#52) · 9f53922a
  Graham King authored Mar 07, 2025

  There are two etcd keys:
  - The service
  - The model

  The second one is the interesting one for us. Previously we confused the two.
- refactor: Use library constant for kv-hit-rate subject (#48) · 2ee29443
  Ryan McCormick authored Mar 07, 2025
```
Replaces hard-coded "kv-hit-rate" string in multiple places with KV_HIT_RATE_SUBJECT constant in lib/llm.
```
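As a minimal illustration of the refactor in #48, a shared constant lets publishers and subscribers agree on the subject without repeating the string literal. The constant name and value come from the commit message. The `hit_rate_subject` helper and its namespacing scheme are hypothetical and only show a call site using the constant.

```rust
/// Subject for KV hit-rate events, defined once and reused everywhere.
pub const KV_HIT_RATE_SUBJECT: &str = "kv-hit-rate";

/// Hypothetical helper: prefix the subject with a namespace.
/// The "namespace.subject" layout is an assumption for illustration.
pub fn hit_rate_subject(namespace: &str) -> String {
    format!("{}.{}", namespace, KV_HIT_RATE_SUBJECT)
}
```

With the literal in one place, a typo in the subject becomes a compile error (unknown identifier) instead of a silently mismatched subscription.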

06 Mar, 2025 2 commits
- feat: Add estimated kv cache hit metric events (#30) · 09656f6c
  Ryan McCormick authored Mar 06, 2025
- refactor: Simplify codespell configuration, allow contractions, add custom dictionary (#28) · e1ae9aa0
  Ryan McCormick authored Mar 05, 2025
05 Mar, 2025 1 commit
- refactor: rename triton_distributed to dynemo (#22) · 1af7433b
  Neelay Shah authored Mar 05, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
03 Mar, 2025 1 commit
- fix: Install specific toolchain (#329) · 2d906fb4
  Graham King authored Mar 03, 2025

  `cargo build --locked` won't let you use "1.85.0" if you only have "stable" installed, even if those are the same thing right now.

27 Feb, 2025 2 commits
- refactor: service/endpoint stats_handler (#282) · 85cc7b67
  Ryan Olson authored Feb 27, 2025
- ci: build wheel from root directory (#274) · ea401e3b
  Anant Sharma authored Feb 27, 2025
26 Feb, 2025 3 commits
- fix: Fix stream::until_deadline bug and improve metric examples (#280) · 494d5625
  Ryan McCormick authored Feb 26, 2025
```
Co-authored-by: Ryan Olson <rolson@nvidia.com>
```

- feat: Endpoint defaults for namespace/component/other (#277) · 31d27ab2
  Graham King authored Feb 26, 2025

  This means we don't need to explain the parts to the users until they are ready. We use what they provide and default the rest.

  Allows all of this and more:
  - `tio out=tdr://test`
  - `tio out=tdr://llama_8b_pool`
  - `tio in=tdr://corp_ai_research_group/model_next-20250226`
  - `tio out=tdr://AIRE.NIM.migrate.mistralrs.1802`

  Python, API, etc. are all untouched.
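One way to picture "use what they provide and default the rest" is a parser that fills in missing parts. The sketch below is an assumption-laden illustration, not the project's code: the default values, the `EmptySegment` error, and the mapping of one, two, or more parts onto namespace/component/name are all guesses. The `FromStr` and `TryFrom<&str>` impls simply mirror the trait names mentioned in #263.

```rust
use std::convert::TryFrom;
use std::str::FromStr;

const SCHEME: &str = "tdr://";
// Placeholder defaults; the real default values are not shown in the log.
const DEFAULT_NAMESPACE: &str = "default_ns";
const DEFAULT_COMPONENT: &str = "default_component";
const DEFAULT_NAME: &str = "default_endpoint";

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Endpoint {
    pub namespace: String,
    pub component: String,
    pub name: String,
}

/// Hypothetical error for inputs such as "a..b".
#[derive(Debug, PartialEq, Eq)]
pub struct EmptySegment;

impl FromStr for Endpoint {
    type Err = EmptySegment;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let path = s.strip_prefix(SCHEME).unwrap_or(s);
        // Accept both '.' and '/' as separators, as in the examples above.
        let parts: Vec<&str> = if path.is_empty() {
            Vec::new()
        } else {
            path.split(|c| c == '.' || c == '/').collect()
        };
        if parts.iter().any(|p| p.is_empty()) {
            return Err(EmptySegment);
        }
        // Assumed mapping: one part is the component, two are
        // namespace + component, three or more fold the tail into the name.
        let (namespace, component, name) = match parts.as_slice() {
            [] => (
                DEFAULT_NAMESPACE.to_string(),
                DEFAULT_COMPONENT.to_string(),
                DEFAULT_NAME.to_string(),
            ),
            [c] => (DEFAULT_NAMESPACE.to_string(), c.to_string(), DEFAULT_NAME.to_string()),
            [ns, c] => (ns.to_string(), c.to_string(), DEFAULT_NAME.to_string()),
            [ns, c, rest @ ..] => (ns.to_string(), c.to_string(), rest.join("_")),
        };
        Ok(Endpoint { namespace, component, name })
    }
}

impl TryFrom<&str> for Endpoint {
    type Error = EmptySegment;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        s.parse()
    }
}
```

Under these assumptions, `"tdr://test".parse::<Endpoint>()` yields the component `test` with a default namespace and name, while the longer dotted form keeps its first two parts and joins the remainder into the endpoint name.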

- ci: fix rust deny workflow (#275) · 76439997
  Anant Sharma authored Feb 26, 2025

25 Feb, 2025 4 commits
- chore: updating docs after restructure · c70de37f
  Neelay Shah authored Feb 25, 2025
- feat: Add completion endpoint to http server and llmctl (#230) · b760c569
  Alec authored Feb 25, 2025
```
Co-authored-by: aflowers <aflowers@nvidia.com>
```
- refactor: adds `TryFrom<&str>` and `FromStr` for `Endpoint` (#263) · e0e9f4a2
  Paul Hendricks authored Feb 25, 2025
- refactor: move libs to lib dir · 08fcd7e9
  Neelay Shah authored Feb 24, 2025
```
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```