- 22 May, 2025 (1 commit)
  - Harrison Saturley-Hall authored
- 20 May, 2025 (2 commits)
  - Harrison Saturley-Hall authored
  - Harrison Saturley-Hall authored
- 15 May, 2025 (2 commits)
  - mohammedabdulwahhab authored
  - Anant Sharma authored
- 14 May, 2025 (2 commits)
  - Anant Sharma authored
  - Harrison Saturley-Hall authored
- 13 May, 2025 (3 commits)
  - Biswa Panda authored
    Co-authored-by: Graham King <grahamk@nvidia.com>
    Co-authored-by: hongkuan <hongkuanz@nvidia.com>
    Co-authored-by: Ubuntu <ubuntu@crusoe-prod--inst-2wjuoekvfq72mlpdrcugujrtgfp.us-east1-a.compute.internal>
  - Anant Sharma authored
  - Hongkuan Zhou authored
- 12 May, 2025 (1 commit)
  - Anant Sharma authored
- 10 May, 2025 (2 commits)
  - ishandhanani authored
  - Harrison Saturley-Hall authored
- 09 May, 2025 (8 commits)
  - Harrison Saturley-Hall authored
    Co-authored-by: Ryan Olson <ryanolson@users.noreply.github.com>
  - ishandhanani authored
  - Graham King authored
    That avoids passing the `--model-config` param to dynamo-run when using llamacpp.
  - Harrison Saturley-Hall authored
  - wxsm authored
    Allow either password or TLS auth; if neither is provided, fall back to no auth (see the sketch after this list). Closes #657.
  - Biswa Panda authored
  - ishandhanani authored
    Co-authored-by: ishandhanani <ishandhananai@gmail.com>
  - Adit Ranadive authored
    NIXL uses UCX, which will have EFA support as of 1.19. Explicitly use the 1.19 branch of UCX with Dynamo.
    Signed-off-by: Adit Ranadive <aranadive@nvidia.com>
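As an illustration of the password/TLS fallback in the wxsm commit above, here is a minimal Rust sketch of that selection pattern. Everything in it (`AuthConfig`, `AuthMode`, `select_auth`, and the password-before-TLS precedence) is a hypothetical stand-in, not Dynamo's actual API.

```rust
// Hypothetical sketch of the auth fallback described in the commit above.
// None of these types exist in Dynamo; they only illustrate the pattern.

#[derive(Debug)]
enum AuthMode {
    Password { user: String, password: String },
    Tls { cert_path: String, key_path: String },
    None, // fallback when nothing is configured
}

struct AuthConfig {
    user: Option<String>,
    password: Option<String>,
    tls_cert: Option<String>,
    tls_key: Option<String>,
}

/// Pick password auth when credentials are set, TLS when both cert and
/// key are set, and otherwise fall back to no auth instead of erroring.
fn select_auth(cfg: &AuthConfig) -> AuthMode {
    match (&cfg.user, &cfg.password, &cfg.tls_cert, &cfg.tls_key) {
        (Some(u), Some(p), _, _) => AuthMode::Password {
            user: u.clone(),
            password: p.clone(),
        },
        (_, _, Some(c), Some(k)) => AuthMode::Tls {
            cert_path: c.clone(),
            key_path: k.clone(),
        },
        _ => AuthMode::None,
    }
}

fn main() {
    // An empty config degrades to AuthMode::None rather than failing.
    let cfg = AuthConfig { user: None, password: None, tls_cert: None, tls_key: None };
    println!("{:?}", select_auth(&cfg));
}
```

The ordering of the match arms carries the design: a fully empty config falls through to `AuthMode::None`, which is the "fall back to no auth" behaviour the commit describes.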
- 08 May, 2025 (9 commits)
  - Hongkuan Zhou authored
  - julienmancuso authored
    Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
  - hhzhang16 authored
  - Graham King authored
    - New mistralrs and llamacpp versions
    - mistralrs: handle Gemma 3 and Llama 4 as vision models
    - Update the dynamo-run docs to use Qwen 3
    - Our pre-processor now supports Llama 4's newer multi-modal `config.json`
    - Upgrade minijinja to handle Qwen 3's prompt template
    For Llama 4 we'll need to limit the max seq len (a back-of-the-envelope KV-cache estimate follows this list). vllm says:
    > To serve at least one request with the model's max seq len (10485760), 240.00 GiB KV cache is needed, ...
    I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.
  - Ryan McCormick authored
  - Anthony Casagrande authored
    Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
  - Yan Ru Pei authored
  - Anant Sharma authored
  - hhzhang16 authored
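On the KV-cache number quoted in the Graham King commit above: at a max seq len of 10,485,760 tokens, 240.00 GiB works out to roughly 24 KiB of K/V state per token. A generic back-of-the-envelope estimate of that footprint (a standard formula, not necessarily vllm's exact accounting; the symbols are illustrative, not values taken from the commit):

```latex
% Generic KV-cache size estimate (assumed formula, not vLLM's exact accounting).
% One K and one V tensor per attention layer, stored for every token;
% b is bytes per element (2 for fp16/bf16).
\[
\text{KV bytes} \approx 2 \cdot n_{\text{layers}} \cdot n_{\text{kv\_heads}}
                 \cdot d_{\text{head}} \cdot L_{\text{seq}} \cdot b
\]
```

Because the estimate is linear in the sequence length, capping the max seq len is the practical lever, which is what the commit does for Llama 4.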
- 07 May, 2025 (10 commits)
  - Hongkuan Zhou authored
  - Kris Hung authored
  - Graham King authored
    Signed-off-by: Graham King <graham@gkgk.org>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  - Ryan McCormick authored
  - Biswa Panda authored
  - Tanmay Verma authored
    Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  - 祝健聪 authored
    Signed-off-by: Chasing1020 <chasing1020@gmail.com>
  - Anthony Casagrande authored
  - Graham King authored
    vllm and sglang are now the sub-process engines from #954. Also updated the docs on running vllm and sglang multi-GPU (tensor parallel) and multi-node (pipeline parallel).
  - ptarasiewiczNV authored