Commits · 0086ebc630326cbf705328580aa8ce6dcf5a98d4 · OpenDAS / dynamo

01 May, 2025 1 commit
- fix: add dedicated llmapi config for trtllm disagg kv routing example (#916) · 0086ebc6
  Ziqi Fan authored Apr 30, 2025
  
28 Apr, 2025 1 commit

- fix: change the processor number to 5 to reduce the tokenization bottleneck (#865) · 6630fa5c
  richardhuo-nv authored Apr 28, 2025

  We observed a 40% performance drop relative to trtllm serve when benchmarking with isl=1000 and osl=200 (input and output sequence lengths) at concurrency levels above 128.

  The number of tokenization workers was the bottleneck. After raising the number of tokenization processors to 5, dynamo's benchmark performance matches that of trtllm serve.

11 Apr, 2025 1 commit

- feat: TRT-LLM disaggregated serving using UCX (#562) · da38e96a
  Tanmay Verma authored Apr 10, 2025

  Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
  Signed-off-by: Tanmay Verma <tanmayv@nvidia.com>
  Co-authored-by: Neelay Shah <neelays@nvidia.com>

10 Apr, 2025 1 commit
- docs: update dynamo serve trtllm agg example yaml files (#600) · 34be4418
  Ziqi Fan authored Apr 10, 2025
  
08 Apr, 2025 1 commit

- chore: Update TRTLLM version. Fix router. (#527) · 7dca64df
  Tanmay Verma authored Apr 07, 2025

  Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
  Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>

03 Apr, 2025 1 commit
- feat: Add TensorRT-LLM example for dynamo serve/run (#456) · 6eb10540
  Tanmay Verma authored Apr 03, 2025

  Co-authored-by: Neelay Shah <neelays@nvidia.com>

19 Mar, 2025 1 commit
- feat: `Frontend` component uses served_model_name instead of model (#302) · 1f6ccc7f
  ishandhanani authored Mar 19, 2025
  
17 Mar, 2025 4 commits
- chore: move examples to top level (#220) · 4b1867c5
  Neelay Shah authored Mar 17, 2025
  
- revert: "moving examples to top level" (#218) · 21b795e8
  Anant Sharma authored Mar 17, 2025
  
- moving examples to top level · 8891aa0c
  nnshah1 authored Mar 17, 2025
  
- chore: refactor examples and clean CLI (#195) · df51a622
  ishandhanani authored Mar 16, 2025
  
15 Mar, 2025 2 commits
- feat: add routerless processor based monolith example (#180) · dd238a26
  Biswa Panda authored Mar 15, 2025
  
- feat(deploy): Add examples for dynamo serve (#173) · b4aff959
  Biswa Panda authored Mar 15, 2025

  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
  Co-authored-by: Neelay Shah <neelays@nvidia.com>

14 Mar, 2025 1 commit
- feat(sdk): add initial graph structure for prebuilt components (#130) · b8120504
  ishandhanani authored Mar 14, 2025

  Co-authored-by: Biswa Panda <biswa.panda@gmail.com>