- 17 Mar, 2025 10 commits
-
-
Anant Sharma authored
Co-authored-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>
-
ptarasiewiczNV authored
Co-authored-by: hongkuanz <hongkuanz@nvidia.com>
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
-
Suman Tatiraju authored
-
GuanLuo authored
-
ishandhanani authored
-
Ryan McCormick authored
-
Ryan McCormick authored
-
Neelay Shah authored
-
Anant Sharma authored
-
Anant Sharma authored
-
- 16 Mar, 2025 10 commits
-
-
Dmitry Tokarev authored
-
Anant Sharma authored
-
David Zier authored
-
Neelay Shah authored
-
ptarasiewiczNV authored
Co-authored-by: hongkuanz <hongkuanz@nvidia.com>
-
Harrison Saturley-Hall authored
-
julienmancuso authored
Co-authored-by: Maksim Khadkevich <mkhadkevich@nvidia.com>
-
Maksim Khadkevich authored
-
ishandhanani authored
-
April Yang authored
Co-authored-by: Julien Mancuso <jmancuso@nvidia.com>
Co-authored-by: Hannah Zhang <hannahz@nvidia.com>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: Maksim Khadkevich <mkhadkevich@nvidia.com>
-
- 15 Mar, 2025 10 commits
-
-
Biswa Panda authored
-
Neelay Shah authored
-
ptarasiewiczNV authored
-
Matthew Kotila authored
-
Biswa Panda authored
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
-
ptarasiewiczNV authored
-
Harrison Saturley-Hall authored
-
Maksim Khadkevich authored
-
julienmancuso authored
-
Graham King authored
```
dynamo-run in=batch:prompts.jsonl out=mistralrs ~/llm_models/Llama-3.2-3B-Instruct/
```

The input file has genai format, one JSON entry per line:

```
{"text": "the prompt"}
{"text": ..etc
```

Each prompt is evaluated and the output is written to `output.jsonl` in the same folder as the input. At the end of the run, various statistics are printed:

> Ran 5 files in 8s 679ms. Tokens in: 40 (5/s). Tokens out: 346 (43/s)

This is also helpful for pushing load into the system and stressing the various components. It is a batch inference tool, not intended for performance measurement.
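A minimal sketch of generating a batch input file in the format described above (the filename and the prompt texts here are just examples, not from the commit):

```python
import json

# Example prompts; in practice these would come from your own dataset.
prompts = [
    "What is the capital of France?",
    "Summarize the plot of Hamlet in one sentence.",
]

# One JSON object per line, each with a "text" field (the genai format).
with open("prompts.jsonl", "w") as f:
    for text in prompts:
        f.write(json.dumps({"text": text}) + "\n")

# Sanity-check: every line must parse as JSON and carry a "text" key.
with open("prompts.jsonl") as f:
    entries = [json.loads(line) for line in f]
assert all("text" in entry for entry in entries)
```

The resulting `prompts.jsonl` can then be passed via `in=batch:prompts.jsonl` as shown above.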
-
- 14 Mar, 2025 10 commits
-
-
Anant Sharma authored
-
Graham King authored
Engines mistralrs, sglang and vllm are included by default. They can be disabled like this: `cargo build --no-default-features --features <add-back-what-you-want>`.

Added a `--features vulkan` option for llamacpp.

A build-time message is printed if CUDA or Metal are missing and would help. That's the best we can do:

> warning: dynamo-run@0.1.0: CUDA not enabled, re-run with `--features cuda`

A runtime message is printed if CUDA, Metal or Vulkan are enabled:

> 2025-03-14T21:59:26.501937Z INFO dynamo_run: CUDA on

And a runtime message if they are missing:

> 2025-03-14T22:02:37.439404Z INFO dynamo_run: CPU mode. Rebuild with `--features cuda|metal|vulkan` for better performance

The default engine message includes the available engines:

> 2025-03-14T21:59:26.503612Z INFO dynamo_run: Using default engine: mistralrs. Use out=<engine> to specify one of echo_core, echo_full, mistralrs, llamacpp, sglang, vllm, pystr, pytok

The really important outcome is that this should now "just work":

```
cargo install dynamo-run
dynamo-run Qwen/Qwen2.5-3B-Instruct
```

Sadly you still need `--features cuda|metal` for performance; I couldn't automate that.
-
Pavithra Vijayakrishnan authored
-
Hongkuan Zhou authored
-
Ryan McCormick authored
-
ishandhanani authored
Co-authored-by:Biswa Panda <biswa.panda@gmail.com>
-
Graham King authored
On Mac, embedded Python interpreters don't pick up the virtual env. This seems to be a known problem. Fix `sys.path`.
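A minimal sketch of the kind of fix described, assuming the active virtual env is found via the `VIRTUAL_ENV` environment variable (the actual commit may locate it differently):

```python
import os
import site
import sys


def add_venv_site_packages() -> None:
    """If the embedded interpreter missed an active virtual env,
    append that env's site-packages directory to sys.path."""
    venv = os.environ.get("VIRTUAL_ENV")
    if not venv:
        return
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    site_packages = os.path.join(venv, "lib", version, "site-packages")
    if os.path.isdir(site_packages) and site_packages not in sys.path:
        # site.addsitedir also processes .pth files, unlike a bare append.
        site.addsitedir(site_packages)
```

Using `site.addsitedir` rather than `sys.path.append` keeps `.pth`-based packages working inside the embedded interpreter.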
-
Hongkuan Zhou authored
-
Hongkuan Zhou authored
-
Anant Sharma authored
-