Commits · 9bb1af33767fa51be2f3d90ac0cd3be9fd181353 · OpenDAS / dynamo

27 Oct, 2025 1 commit
- chore: Update version to 0.6.1 (#3916) · 9bb1af33
  Tushar Sharma authored Oct 27, 2025
```
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
```
  9bb1af33
21 Oct, 2025 1 commit
- refactor(runtime): Replace std::sync::Mutex with parking_lot::Mutex (#3740) · 9ae98ed7
  Graham King authored Oct 21, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  9ae98ed7
13 Oct, 2025 1 commit
- chore: pre-0.6.0 activities (#3592) · cd2389ba
  Harrison Saturley-Hall authored Oct 13, 2025
```
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
```
  cd2389ba
30 Sep, 2025 1 commit
- chore: Move model_input, model_type from ModelEntry to ModelDeploymentCard (#3292) · 6ffd20a8
  Graham King authored Sep 30, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  6ffd20a8
24 Sep, 2025 1 commit
- chore: bump versions ahead of 0.5.1 release (#3209) · 980727bb
  Harrison Saturley-Hall authored Sep 24, 2025
```
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
```
  980727bb
05 Sep, 2025 1 commit
- fix: Load the tokenizer JSON once for chat and completions. (#2910) · cb5a657a
  Graham King authored Sep 05, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  cb5a657a
02 Sep, 2025 1 commit
- chore: bump version numbers ahead of 0.5.0 release (#2812) · 561ecb98
  Harrison Saturley-Hall authored Sep 02, 2025
```
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
```
  561ecb98
22 Aug, 2025 1 commit
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
20 Aug, 2025 1 commit
- chore: Bumped Dynamo version to 0.4.1 (#2545) · 9a021885
  Dmitry Tokarev authored Aug 19, 2025
  
  9a021885
18 Aug, 2025 1 commit
- fix: replace metrics callback with background scraping to prevent tim… (#2480) · 04442173
  Keiven C authored Aug 18, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  04442173
13 Aug, 2025 1 commit
- feat: Allow an endpoint to serve multiple models (#2418) · 72ec5f5c
  Graham King authored Aug 13, 2025
  
  72ec5f5c
07 Aug, 2025 1 commit
- chore(metrics): Remove the Arc (#2357) · a3f7a39f
  Graham King authored Aug 07, 2025
  
  a3f7a39f
30 Jul, 2025 1 commit
- chore: Version bump to 0.4.0 (#2179) · 4c90b1b9
  Dmitry Tokarev authored Jul 30, 2025
  
  4c90b1b9
28 Jul, 2025 1 commit
- feat: Base metrics: add generic ingress handler metrics (#2090) · 615580d8
  Keiven C authored Jul 28, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  615580d8
22 Jul, 2025 1 commit

- feat: add a hierarchical Prometheus MetricsRegistry trait for DistributedRuntime, Namespace, Components, and Endpoint (#2008) · e5a8628f
  Keiven C authored Jul 22, 2025
  Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
  Co-authored-by: Ryan Olson <rolson@nvidia.com>
  e5a8628f

16 Jul, 2025 1 commit
- perf(router): Remove lock from router hot path (#1963) · aba60996
  Graham King authored Jul 16, 2025
  
  aba60996
08 Jul, 2025 1 commit
- feat: Build DistributedRuntime-level HTTP server with /health /metrics (#1656) · ece76a62
  ZichengMa authored Jul 08, 2025
  
  ece76a62
07 Jul, 2025 1 commit
- chore: update versions for 0.3.2 release (#1793) · c4935b34
  Anant Sharma authored Jul 07, 2025
  
  c4935b34
03 Jul, 2025 1 commit
- chore(engines): Upgrade mistralrs to 0.6.0 (#1767) · 4ab47617
  Graham King authored Jul 03, 2025
  
  4ab47617
13 Jun, 2025 1 commit
- chore: update dynamo and nixl versions for 0.3.1 (#1517) · 99e67e60
  Anant Sharma authored Jun 13, 2025
  
  99e67e60
29 May, 2025 1 commit
- chore: update dynamo and nixl versions for 0.3.0 (#1240) · 9d9a1d9b
  Anant Sharma authored May 29, 2025
  
  9d9a1d9b
09 May, 2025 2 commits
- chore: bump versions and NIXL dependencies for 0.2.1 (#1012) · e9cb035a
  Harrison Saturley-Hall authored May 09, 2025
  
  e9cb035a
- feat: allow adding auth to etcd (#980) · b2e401bc
  wxsm authored May 09, 2025
```
Allow password or TLS auth; if neither is provided, fall back to no auth

Closes #657
```
  b2e401bc
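
The fallback behaviour described in b2e401bc can be illustrated with a minimal sketch. Everything in it is hypothetical: the type names, the field names, and the lookup keys are assumptions made for illustration and are not taken from the repository. It also assumes password and TLS settings may be supplied independently, and that a partially specified group counts as absent.

```rust
/// Hypothetical TLS material locations (illustrative names only).
#[derive(Debug, PartialEq)]
pub struct TlsPaths {
    pub ca: String,
    pub cert: String,
    pub key: String,
}

/// Hypothetical etcd auth settings; both fields empty means "no auth".
#[derive(Debug, PartialEq, Default)]
pub struct EtcdAuth {
    /// (username, password), present only when both values were supplied.
    pub password: Option<(String, String)>,
    /// TLS paths, present only when all three values were supplied.
    pub tls: Option<TlsPaths>,
}

impl EtcdAuth {
    /// Build the settings from any key lookup (for example, environment variables).
    /// The key names below are assumptions, not the project's real variables.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let password = match (lookup("ETCD_AUTH_USERNAME"), lookup("ETCD_AUTH_PASSWORD")) {
            (Some(user), Some(pass)) => Some((user, pass)),
            _ => None, // incomplete credentials are treated as absent
        };
        let tls = match (
            lookup("ETCD_AUTH_CA"),
            lookup("ETCD_AUTH_CLIENT_CERT"),
            lookup("ETCD_AUTH_CLIENT_KEY"),
        ) {
            (Some(ca), Some(cert), Some(key)) => Some(TlsPaths { ca, cert, key }),
            _ => None, // incomplete TLS settings are treated as absent
        };
        EtcdAuth { password, tls }
    }

    /// True when neither mechanism is configured, i.e. the no-auth fallback.
    pub fn is_anonymous(&self) -> bool {
        self.password.is_none() && self.tls.is_none()
    }
}
```

Under these assumptions, a caller could pass `|k| std::env::var(k).ok()` as the lookup and connect without credentials whenever `is_anonymous()` returns true.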
25 Apr, 2025 2 commits

- chore: bump NIXL version and package versions (#836) · 0715d469
  Harrison Saturley-Hall authored Apr 25, 2025
```
Signed-off-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
```
  0715d469

- chore: Publish Model Deployment Card to NATS (#799) · d346782c
  Graham King authored Apr 25, 2025

This will allow an ingress-side pre-processor to see it without needing a model checkout.

Currently pre-processing is done in the worker, which has local access to the model deployment card ("MDC") files (`config.json`, `tokenizer.json` and `tokenizer_config.json`). We want to move the pre-processor to the ingress side to support KV routing. That requires the ingress side (i.e. the HTTP server), which runs on a different machine than the worker, to be able to see those three files.

To support that, this PR makes the worker upload the contents of those files to the NATS object store, and publishes the MDC with those NATS URLs to the key-value store.

The key-value store has an interface so any store (NATS, etcd, Redis, etc.) can be supported. Implementations for memory and NATS are provided.

Fetching the MDC from the store, doing pre-processing ingress side, and publishing a card backed by a GGUF, are all for a later commit.

Part of #743

d346782c
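
As a rough illustration of the pluggable key-value store idea described in d346782c, here is a minimal sketch. It is synchronous for brevity, and every name in it (`KeyValueStore`, `MemoryStore`, `publish_card`, the `mdc/` key prefix) is an assumption made for illustration, not the interface used in the repository. A NATS-backed store would implement the same trait.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical storage interface: any backend (memory, NATS, etcd, ...)
/// would implement these two operations.
pub trait KeyValueStore {
    /// Insert or overwrite the value stored under `key`.
    fn put(&self, key: &str, value: Vec<u8>);
    /// Return a copy of the value stored under `key`, if present.
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// In-memory backend, useful for tests and single-process runs.
#[derive(Default)]
pub struct MemoryStore {
    entries: Mutex<HashMap<String, Vec<u8>>>,
}

impl KeyValueStore for MemoryStore {
    fn put(&self, key: &str, value: Vec<u8>) {
        self.entries.lock().unwrap().insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.entries.lock().unwrap().get(key).cloned()
    }
}

/// Publish an already-serialized card under a per-model key and return that key.
/// Works with any backend because it only depends on the trait.
pub fn publish_card(store: &dyn KeyValueStore, model_name: &str, card: &[u8]) -> String {
    let key = format!("mdc/{model_name}");
    store.put(&key, card.to_vec());
    key
}
```

With this shape, the worker side would call `publish_card` and an ingress-side consumer would call `get` with the same key; swapping `MemoryStore` for another backend does not change either caller.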

09 Apr, 2025 1 commit
- chore: update versions to 0.1.1 (#552) · fa7ee14c
  Anant Sharma authored Apr 09, 2025
  
  fa7ee14c
31 Mar, 2025 1 commit
- refactor: prometheus upgrade (#452) · de290537
  Ryan Olson authored Mar 31, 2025
  
  de290537
19 Mar, 2025 1 commit

- chore: Don't depend on openssl (#292) · 7c3fd5c9
  Graham King authored Mar 19, 2025

This makes all the Rust parts use the ring / rustls libraries instead of a local install of openssl. It's a step on the journey to being statically linked.

Pieces:
- `tokenizers` and `mistralrs` now support rustls (mistralrs by default, tokenizers with feature flag).
- Move shared dependencies up into workspace
- New `rand` crate has some renames for future rust
- Ensure the dependency doesn't creep back in by enforcing it with cargo deny.

7c3fd5c9

14 Mar, 2025 1 commit
- feat: global kv block manager (#45) · f04359cf
  Ryan Olson authored Mar 13, 2025
  
  f04359cf
13 Mar, 2025 1 commit
- build: add top level rust workspace (#137) · 3d292851
  Anant Sharma authored Mar 13, 2025
  
  3d292851
11 Mar, 2025 1 commit
- refactor: Move rust binaries out of examples, update nixl dockerfile (#89) · e5db9e86
  Neelay Shah authored Mar 11, 2025
```
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
```
  e5db9e86
10 Mar, 2025 1 commit
- chore: update wheel name and reset versions (#73) · fc4da345
  Anant Sharma authored Mar 10, 2025
  
  fc4da345
08 Mar, 2025 1 commit
- chore: rename dynamo (#44) · 602352ce
  Neelay Shah authored Mar 08, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
  602352ce
05 Mar, 2025 1 commit
- refactor: rename triton_distributed to dynemo (#22) · 1af7433b
  Neelay Shah authored Mar 05, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  1af7433b
28 Feb, 2025 1 commit
- feat: TensorRT-LLM engine (#317) · 057f8f47
  Graham King authored Feb 28, 2025
```
Engine, `tio` support and docs.

Proof of concept / experimental.
```
  057f8f47
27 Feb, 2025 1 commit
- ci: build wheel from root directory (#274) · ea401e3b
  Anant Sharma authored Feb 27, 2025
  
  ea401e3b
26 Feb, 2025 2 commits
- refactor: using async_openai · 86aff237
  Paul Hendricks authored Feb 26, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  86aff237
- fix: Fix stream::until_deadline bug and improve metric examples (#280) · 494d5625
  Ryan McCormick authored Feb 26, 2025
```
Co-authored-by: Ryan Olson <rolson@nvidia.com>
```
  494d5625
25 Feb, 2025 2 commits

- feat: sglang backend for tio (#271) · e97493eb
  Graham King authored Feb 25, 2025

- Set up venv

```
uv venv
source .venv/bin/activate
uv pip install pip
uv pip install sgl-kernel --force-reinstall --no-deps
uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
```

- Build: `cargo build --release --features sglang`

- Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`

- Run Deepseek multi-gpu / multi-node:

Node 1:
```
tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
```

Node 2:
```
tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
```

e97493eb

- feat: Add completion endpoint to http server and llmctl (#230) · b760c569
  Alec authored Feb 25, 2025
```
Co-authored-by: aflowers <aflowers@nvidia.com>
```
  b760c569