Commits · 00e54337377a3003d47f08d3f0767825a4b084fa · OpenDAS / dynamo

20 Mar, 2025 3 commits

chore: Make debug profile use all optimizations (#317) · 00e54337

Graham King authored Mar 20, 2025

It hardly slows the build down, and it makes things run much faster. That allows us to switch to the debug (default) profile for development, and keep the release profile for, well, releasing.

Motivated by changes in https://github.com/ai-dynamo/dynamo/pull/279

00e54337

feat: add more useful APIs for tokens (#313) · d4d93b6a

Nora authored Mar 20, 2025

Add `AsMut`, `DerefMut`, and `IntoIterator` trait impls for the `Tokens` structure.

Signed-off-by: nora-coder-dot <nora6677@gmail.com>
Co-authored-by: nora-coder-dot <nora6677@gmail.com>
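
As a rough illustration of what these trait impls enable, here is a minimal sketch. It assumes `Tokens` is a newtype over `Vec<u32>`; that layout, the `u32` element type, and the `demo` helper are assumptions made for this example and may not match the real definition in the repository.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for the crate's `Tokens` type:
/// a newtype wrapping a vector of token IDs.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Tokens(Vec<u32>);

impl Deref for Tokens {
    type Target = [u32];
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// `DerefMut` requires `Deref`; it lets callers use mutable slice
// methods and index assignment directly on a `Tokens` value.
impl DerefMut for Tokens {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

// `AsMut` lets generic code that accepts `impl AsMut<[u32]>` take `Tokens`.
impl AsMut<[u32]> for Tokens {
    fn as_mut(&mut self) -> &mut [u32] {
        &mut self.0
    }
}

// `IntoIterator` consumes the wrapper and yields the owned token IDs.
impl IntoIterator for Tokens {
    type Item = u32;
    type IntoIter = std::vec::IntoIter<u32>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.into_iter()
    }
}

/// Exercises all three impls on a small example value.
pub fn demo() -> Vec<u32> {
    let mut tokens = Tokens(vec![1, 2, 3]);
    tokens[0] = 10; // index assignment goes through `DerefMut`
    let slice: &mut [u32] = tokens.as_mut(); // explicit `AsMut<[u32]>`
    slice.reverse();
    tokens.into_iter().map(|t| t + 1).collect() // by-value iteration
}
```

With these impls in place, a `Tokens` value can be passed to code written against mutable slices or iterators without first unwrapping the inner vector.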

d4d93b6a

fix: helm tmpl (#307) · 001b07d9

gujing authored Mar 20, 2025

```
Signed-off-by: zibai <zibai.gj@alibaba-inc.com>
```

001b07d9

19 Mar, 2025 10 commits

feat: `Frontend` component uses served_model_name instead of model (#302) · 1f6ccc7f

ishandhanani authored Mar 19, 2025

1f6ccc7f

chore: remove older unused components (#300) · 476174f3

ishandhanani authored Mar 19, 2025

476174f3

chore: Update dynamo.code-workspace (#282) · 19a8a6ec

Elton Leander Pinto authored Mar 19, 2025

```
Co-authored-by: Ryan Olson <ryanolson@users.noreply.github.com>
```

19a8a6ec

fix: update crates metadata (#264) · 68d953f7

Anant Sharma authored Mar 19, 2025

```
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
```

68d953f7

fix: Add `__init__.py` for components folder in llm example (#299) · ff3413be

Piotr Marcinkiewicz authored Mar 19, 2025

ff3413be

chore: Don't depend on openssl (#292) · 7c3fd5c9

Graham King authored Mar 19, 2025

This makes all the Rust parts use the ring / rustls libraries instead of a locally installed OpenSSL. It's a step on the journey to being statically linked.

Pieces:
- `tokenizers` and `mistralrs` now support rustls (mistralrs by default, tokenizers with a feature flag).
- Move shared dependencies up into the workspace.
- The new `rand` crate version has some renames for future Rust.
- Ensure the dependency doesn't creep back in by enforcing it with `cargo deny`.

7c3fd5c9

feat: enable LTO and codegen-units = 1 optimizations (#279) · af8ee9db

Alexander Zaitsev authored Mar 19, 2025

#### Overview:

This PR enables more aggressive compiler optimizations for the project, which should lead to better performance and smaller binary sizes.

I chose Fat LTO over ThinLTO because it provides a higher optimization level.

I ran quick tests (AMD Ryzen 5900x, Fedora 41, Rust 1.85.1, the latest version of the project at the time, built with `cargo build --release`). The resulting binary sizes are:

| Build mode \ Binary | dynamo-run | libdynamo_llm_capi.so | http | llmctl | metrics | mock_worker |
| --- | --- | --- | --- | --- | --- | --- |
| Release | 55 MiB | 14 MiB | 19 MiB | 14 MiB | 21 MiB | 14 MiB |
| Release + `codegen-units = 1` + ThinLTO | 43 MiB | 11 MiB | 15 MiB | 11 MiB | 17 MiB | 11 MiB |
| Release + `codegen-units = 1` + FatLTO | 38 MiB | 9.2 MiB | 13 MiB | 9.6 MiB | 15 MiB | 9.6 MiB |
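
To put the table in relative terms, the sketch below computes how much smaller the `dynamo-run` binary gets under each configuration, using only the sizes reported above. The function names are made up for this example and are not part of the project.

```rust
/// Percentage by which `after` is smaller than `before`.
pub fn reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Relative size reductions for `dynamo-run`, in percent,
/// returned as (ThinLTO, FatLTO) against the plain release build.
pub fn dynamo_run_reductions() -> (f64, f64) {
    // Sizes in MiB, taken from the `dynamo-run` column of the table.
    let release = 55.0;
    let thin_lto = 43.0;
    let fat_lto = 38.0;
    (
        reduction_percent(release, thin_lto),
        reduction_percent(release, fat_lto),
    )
}
```

For `dynamo-run` this works out to roughly a 22% reduction with ThinLTO and roughly 31% with Fat LTO, both combined with `codegen-units = 1`.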

#### Details:

Enable `codegen-units = 1` and Fat LTO for better optimizations.

#### Where should the reviewer start?

Just check the `Cargo.toml` file ;)

#### Related Issues:

- closes GitHub issue: #278

af8ee9db

fix(mistralrs): Disable paged attention (#234) · fd95f37b

Graham King authored Mar 19, 2025

Under load it sometimes drops a request. The request gets added to the batch (sequence) and immediately gets a FinishReason Stop. Not sure why. It doesn't happen with the default scheduler (non-paged attention), so switch to that for now.

fd95f37b

docs: Move back dynamo deploy file to the guides subfolder in docs (#295) · 48a59890

mohammedabdulwahhab authored Mar 19, 2025

```
Co-authored-by: mabdulwahhab <mabdulwahhab@nvidia.com>
```

48a59890

fix(dynamo-run): Fix build if llamacpp and mistralrs are disabled (#262) · 3ac95a90

Graham King authored Mar 19, 2025

3ac95a90

18 Mar, 2025 20 commits

- docs: proper installation steps + Ubuntu 24.04 support (#275) · ba33b2bd
  Dmitry Tokarev authored Mar 18, 2025
  ```
  Co-authored-by: Anant Sharma <anants@nvidia.com>
  ```
  ba33b2bd
- docs: Update README.md - add missing python3-pip package (#263) · 004b6e6a
  Dmitry Tokarev authored Mar 18, 2025

  004b6e6a
- fix: update readme discord link (#271) · 16d0d60f
  ishandhanani authored Mar 18, 2025
  ```
  Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  16d0d60f
- docs: dynamo serve guide (#270) · a5113e46
  ishandhanani authored Mar 18, 2025
  ```
  Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
  ```
  a5113e46
- docs: Clean up of readme for deploying to K8s using helm (#266) · 610ef375
  mohammedabdulwahhab authored Mar 18, 2025
  ```
  Co-authored-by: mabdulwahhab <mabdulwahhab@nvidia.com>
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  610ef375
- docs(dynamo-run): Move README into docs/guides/ , add Quickstart (#265) · 40c55a24
  Graham King authored Mar 18, 2025

  40c55a24
- feat: add local gpu allocation (#232) · 9f0181a8
  Biswa Panda authored Mar 18, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  9f0181a8
- docs: fix links in docs (#256) · 548578f4
  Dmitry Tokarev authored Mar 18, 2025
  ```
  Co-authored-by: Anant Sharma <anants@nvidia.com>
  ```
  548578f4
- chore: remove dynamo from vllm whl version (#257) · 792b747c
  Anant Sharma authored Mar 18, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  792b747c
- fix: temporary documentation for crates.io (#255) · 1ccd4caa
  Harrison Saturley-Hall authored Mar 18, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  1ccd4caa
- Update README.md · 05d19c23
  Meenakshi Sharma authored Mar 17, 2025

  05d19c23
- fix: created documentation to deploy_to_k8s_using_helm (#245) · 3983830e
  Maksim Khadkevich authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  3983830e
- fix: update default to dev (#251) · 93a46969
  Neelay Shah authored Mar 17, 2025

  93a46969
- docs: update guides and filenames (#252) · c2a6b368
  Suman Tatiraju authored Mar 17, 2025

  c2a6b368
- chore: rename patched vllm wheel to ai_dynamo_vllm (#250) · 5161250a
  Anant Sharma authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  5161250a
- Update README.md · 8f9dcad4
  Meenakshi Sharma authored Mar 17, 2025

  8f9dcad4
- fix: more closely mimic perf analyzer location to previous behavior (#246) · 5e70dd60
  Harrison Saturley-Hall authored Mar 17, 2025

  5e70dd60
- Docs: Update README.md (#249) · 0ba0df4b
  Meenakshi Sharma authored Mar 17, 2025

  0ba0df4b
- docs: Discord banner (#248) · 708b1aae
  Meenakshi Sharma authored Mar 17, 2025
  ```
  Co-authored-by: Nicolas Noble <nicolasnoble@users.noreply.github.com>
  ```
  708b1aae
- docs: add support matrix (#210) · 63974527
  Pavithra Vijayakrishnan authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  63974527

17 Mar, 2025 7 commits

- docs: Adding Discord banner (#238) · f189b79c
  Nicolas Noble authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  f189b79c
- fix: propagate env vars from input cli/yaml into process (#208) · a611726e
  ishandhanani authored Mar 17, 2025

  a611726e
- docs: add guides to docs (#243) · 9be75482
  Suman Tatiraju authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  9be75482
- chore: Update README.md (#242) · eca57f66
  Neelay Shah authored Mar 17, 2025

  eca57f66
- docs: point to right sdk (#241) · 18ce1f9e
  ishandhanani authored Mar 17, 2025
  ```
  Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
  ```
  18ce1f9e
- fix: kkranen re-codeowner (#240) · 29b2a7c5
  kkranen authored Mar 17, 2025

  29b2a7c5
- docs: add docs for SDK and CLI and how to use (#209) · b0f433ee
  ishandhanani authored Mar 17, 2025
  ```
  Signed-off-by: ishandhanani <ishandhanani@gmail.com>
  Co-authored-by: mabdulwahhab <mabdulwahhab@nvidia.com>
  ```
  b0f433ee