Commits · 04e892d196220d769d895a0932fc2f3464840348 · OpenDAS / dynamo

25 Apr, 2025 10 commits
- feat: misc changes while deploying (#831) · 04e892d1
  hhzhang16 authored Apr 25, 2025
  
- chore: update vllm wheel dependency version (#828) · 3f5a44ab
  Anant Sharma authored Apr 25, 2025
  
- fix: add VLLM_KV_CAPI_PATH to vllm dockerfile to make kv routing work (#832) · f5e8488c
  Ziqi Fan authored Apr 25, 2025
  
- feat: add network configuration wizard during platform install (#820) · 1de737fe
  julienmancuso authored Apr 25, 2025
  
- build: update cudarc dependency to crate version (#815) · 448e79a6
  Anant Sharma authored Apr 25, 2025
  
- fix: Change default vLLM router to round-robin (#597) · 0e4fffbc
  Piotr Marcinkiewicz authored Apr 25, 2025
  
- fix: remove dynamo cloud login (#824) · 12f72a42
  mohammedabdulwahhab authored Apr 25, 2025
  
- chore: Publish Model Deployment Card to NATS (#799) · d346782c
  Graham King authored Apr 25, 2025
```
This will allow an ingress-side pre-processor to see it without needing a model checkout.

Currently, pre-processing is done in the worker, which has local access to the model deployment card ("MDC") files (`config.json`, `tokenizer.json` and `tokenizer_config.json`). We want to move the pre-processor to the ingress side to support KV routing. That requires the ingress side (i.e., the HTTP server), running on a different machine than the worker, to be able to see those three files.

To support that, this PR makes the worker upload the contents of those files to the NATS object store and publish the MDC, with those NATS URLs, to the key-value store.

The key-value store has an interface so any store (NATS, etcd, Redis, etc.) can be supported. Implementations for memory and NATS are provided.

Fetching the MDC from the store, doing pre-processing on the ingress side, and publishing a card backed by a GGUF are all left for a later commit.

Part of #743 
```
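The flow described in #799 (upload the three MDC files to an object store, then publish a card that points at them through a pluggable key-value interface) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's implementation: `KeyValueStore`, `MemoryStore`, `publish_card`, the `mdc/` key prefix, and the `mem://` URL scheme are all hypothetical names chosen for the example, and an in-memory store stands in for NATS.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KeyValueStore(ABC):
    """Hypothetical minimal store interface; a NATS, etcd, or Redis backend could implement it."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...


class MemoryStore(KeyValueStore):
    """In-memory backend, used here as a stand-in for both the object store and the KV store."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


# The three files named in the commit message.
MDC_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json")


def publish_card(model_name: str, files: Dict[str, bytes],
                 object_store: KeyValueStore, kv_store: KeyValueStore) -> dict:
    """Upload each file's contents, then publish a card that references the uploads."""
    card = {"model": model_name, "files": {}}
    for name in MDC_FILES:
        object_key = f"{model_name}/{name}"
        object_store.put(object_key, files[name])
        # Assumption: "mem://" is a placeholder for whatever URL form the real store uses.
        card["files"][name] = f"mem://{object_key}"
    kv_store.put(f"mdc/{model_name}", json.dumps(card).encode())
    return card
```

Because the card carries references rather than local paths, a reader on another machine would only need access to the two stores, which is the property the commit message is after.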
- refactor: refactor dynamo serve part-1/N (#788) · 16310b26
  Biswa Panda authored Apr 25, 2025
```
Co-authored-by: ishandhanani <ishandhanani@gmail.com>
```
- feat: remove proxy side car (#822) · dbdbd5e5
  julienmancuso authored Apr 24, 2025
  
24 Apr, 2025 9 commits
- docs: Update README.md (#821) · 21e97b0d
  Alec authored Apr 24, 2025
```
Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
```
- refactor: transition CLI to use typer for UX and testing (#703) · f27cdbcb
  ishandhanani authored Apr 24, 2025
```
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
```
- feat: remove old bento images (#801) · 4d02a463
  julienmancuso authored Apr 24, 2025
  
- feat: Add unified x86 / aarch64 (ARM) build for TRTLLM image (#803) · c522253b
  Ryan McCormick authored Apr 24, 2025
```
Signed-off-by: Ryan McCormick <rmccormick@nvidia.com>
```
- feat: improve dynamo deployment CLI (#798) · c0bdf412
  hhzhang16 authored Apr 24, 2025
```
Co-authored-by: Julien Mancuso <jmancuso@nvidia.com>
```
- feat: Warm‑up mistral.rs engine to reduce latency on subsequent requests (#796) · 4761baa6
  Abrar Shivani authored Apr 24, 2025
```
Send a warm‑up request to the mistralrs engine so that subsequent requests are faster.
```
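The warm-up idea in #796 can be illustrated with a toy engine that pays a one-time setup cost on its first request. This is a generic Python sketch under stated assumptions, not the mistral.rs integration: `LazyEngine`, `warm_up`, and the dummy prompt are hypothetical, and the "setup" is only a counter.

```python
class LazyEngine:
    """Hypothetical engine whose first request triggers one-time setup work."""

    def __init__(self) -> None:
        self._ready = False
        self.setup_calls = 0  # counts how often the expensive setup ran

    def _setup(self) -> None:
        # Stand-in for slow first-request work (loading weights, compiling kernels, ...).
        self.setup_calls += 1
        self._ready = True

    def generate(self, prompt: str) -> str:
        if not self._ready:
            self._setup()
        return prompt.upper()  # trivial placeholder for real inference


def warm_up(engine: LazyEngine, prompt: str = "warm-up") -> None:
    """Send one throwaway request at startup so later requests skip the setup cost."""
    engine.generate(prompt)  # output is intentionally discarded
```

Calling `warm_up(engine)` once after construction moves the setup cost to startup, so the first real request behaves like any later one.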
- chore: Increase sleep times from 2s -> 30s for startup logs (#807) · aae0d405
  Ryan McCormick authored Apr 23, 2025
  
- fix: Update TRTLLM version and fix disagg workflow (#804) · 197105eb
  Tanmay Verma authored Apr 23, 2025
  
- feat: Add linux aarch64 support to dynamo-run build (#802) · d757604c
  Ryan McCormick authored Apr 23, 2025
  
23 Apr, 2025 8 commits
- feat: rename operator CRDs (#795) · 26fe79dc
  julienmancuso authored Apr 23, 2025
  
- feat: Add log verbosity level flag to dynamo-run cli (#780) · a03fd307
  Abrar Shivani authored Apr 24, 2025
```
#### Overview:

This PR adds a command-line verbosity flag (-v, -vv) to dynamo-run to control log levels.
- Added new verbosity flag to Flags struct:
  - -v: Sets log level to debug
  - -vv: Sets log level to trace
  - No flag (default): Keeps log level at info

#### Details:
- closes GitHub issue: https://github.com/ai-dynamo/dynamo/issues/567
```
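The mapping described in #780 (no flag is info, `-v` is debug, `-vv` is trace) is a common counted-flag pattern. The sketch below shows it with Python's `argparse` rather than the actual `Flags` struct in dynamo-run; the `log_level` helper and the `dynamo-run-sketch` program name are hypothetical, and clamping anything beyond `-vv` to trace is an assumption, not something the commit states.

```python
import argparse
from typing import List

# Index = number of times -v was given.
LEVELS = ["info", "debug", "trace"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dynamo-run-sketch")
    # action="count" increments the value for every occurrence, so "-vv" yields 2.
    parser.add_argument("-v", dest="verbosity", action="count", default=0,
                        help="increase log verbosity (-v debug, -vv trace)")
    return parser


def log_level(argv: List[str]) -> str:
    """Translate the counted -v flag into a log level name."""
    args = build_parser().parse_args(argv)
    # Assumption: extra repetitions beyond -vv are clamped to the most verbose level.
    return LEVELS[min(args.verbosity, len(LEVELS) - 1)]
```

For example, `log_level(["-vv"])` selects the trace level, while an empty argument list keeps the default.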
- docs: add note to use release branch examples (#793) · ba0a51c4
  Anant Sharma authored Apr 23, 2025
```
Signed-off-by: Anant Sharma <anants@nvidia.com>
```
- feat: remove bento/yatai references (#782) · f11ea8f7
  julienmancuso authored Apr 23, 2025
  
- build: add rust binaries in manylinux image (#783) · ea84ab11
  Anant Sharma authored Apr 23, 2025
  
- chore: fix model arg name in multinode 405b example (#770) · 41f3e0e0
  KennyMcCormick authored Apr 24, 2025
```
Signed-off-by: cormick <cormick1080@gmail.com>
```
- docs: Custom Backend/Worker Guide (#608) · 5ddb181c
  Ryan McCormick authored Apr 22, 2025
  
- feat: allow to CRUD dynamo pipelines (#761) · de77d3f9
  julienmancuso authored Apr 22, 2025
  
22 Apr, 2025 4 commits
- docs: R1 disaggregation guide (#720) · e06bfd55
  GuanLuo authored Apr 22, 2025
  
- chore: Update bug report to use dynamo env for collecting environment information (#558) · cce0c0f0
  Tushar Sharma authored Apr 22, 2025
  
- feat: add option to configure separate docker registry for pipelines docker images (#744) · 36172e6e
  julienmancuso authored Apr 22, 2025
  
- docs: deployment docs improvements (#753) · 5aa5d4b2
  hhzhang16 authored Apr 21, 2025
  
21 Apr, 2025 8 commits
- fix: give the user ownership permissions of /opt/dynamo/venv (#767) · 43dc9cee
  hhzhang16 authored Apr 21, 2025
  
- fix: Fix cancellation flow in python component graph (#765) · 420b7a82
  Pankaj Gupta authored Apr 21, 2025
  
- feat: MLA disaggregation support to vLLM patch (#745) · 2972b7ed
  ptarasiewiczNV authored Apr 21, 2025
  
- chore: Add roadmap to main README.md (#763) · 85d8d02d
  Harry Kim authored Apr 21, 2025
```
Signed-off-by: Harry Kim <harry_kim@live.com>
```
- feat: add custom lease to worker components (#748) · c392c341
  ishandhanani authored Apr 21, 2025
  
- chore(dynamo-run): Fix echo_core for EOS tokens (#759) · 4e75b04b
  Graham King authored Apr 21, 2025
```
"echo_core" is an engine that echoes the post-processed request back to you so you can see the template. Good for testing. It needed an extra flag set to work correctly.
```
- feat: add additional packages to log filters (#752) · ee865ca0
  Abrar Shivani authored Apr 21, 2025
  
- feat(dynamo-run): make the model name the same as the HF repo name (#749) · f2e0d6c2
  Zhongdongming Dai authored Apr 21, 2025
  
18 Apr, 2025 1 commit
- docs: add aggregated deployment guide for multi-node sized model (#713) · 22cacbb1
  GuanLuo authored Apr 18, 2025
  