08 May, 2025 · 1 commit
      feat: Qwen3, Gemma3 and Llama4 support (#1002) · ceaeba3e
      Graham King authored
- New mistralrs and llamacpp versions
- mistralrs: handle Gemma 3 and Llama 4 as vision models
- Update the dynamo-run docs to use Qwen 3
- Our pre-processor now supports Llama 4's newer multi-modal `config.json`
- Upgrade minijinja to handle Qwen 3's prompt template
      
      For Llama 4 we'll need to limit the max seq len. vllm says:
      > To serve at least one request with the models's max seq len (10485760), (240.00 GiB KV cache is needed,...
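The quoted figure implies a fixed KV-cache cost per token of context, which is what makes capping the max seq len necessary. A minimal sketch of that arithmetic, assuming only the two numbers vllm reports (10,485,760 tokens, 240 GiB); the helper names and the 40 GiB budget are illustrative, not from the Llama 4 config:

```python
# Back-of-the-envelope KV-cache sizing. The only inputs taken from the
# vllm message above are the 10,485,760-token max seq len and the
# ~240 GiB of KV cache it says that would need; everything else here
# (function names, the example budget) is a hypothetical illustration.

GIB = 1024**3

def bytes_per_token(kv_cache_gib: float, max_seq_len: int) -> float:
    """Average KV-cache bytes consumed per token of context."""
    return kv_cache_gib * GIB / max_seq_len

def max_len_for_budget(budget_gib: float, per_token: float) -> int:
    """Longest sequence a given KV-cache budget can hold."""
    return int(budget_gib * GIB / per_token)

per_token = bytes_per_token(240.0, 10_485_760)
print(round(per_token / 1024, 1))            # → 24.0 KiB per token
print(max_len_for_budget(40.0, per_token))   # → 1747626 tokens in 40 GiB
```

So at roughly 24 KiB of KV cache per token, even a generous 40 GiB budget caps usable context well below the model's nominal 10M-token max seq len, which is why we limit it rather than serve the full window.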
      
      I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.