Commits · 0086ebc630326cbf705328580aa8ce6dcf5a98d4 · OpenDAS / dynamo

01 May, 2025 1 commit
- fix: add dedicated llmapi config for trtllm disagg kv routing example (#916) · 0086ebc6
  Ziqi Fan authored Apr 30, 2025
  
30 Apr, 2025 1 commit
- fix: trtllm example (#909) · 49517f2a
  Biswa Panda authored Apr 30, 2025
  
29 Apr, 2025 1 commit

- refactor: change trtllm example kv routing use python bindings | deal with trtllm partial blocks | trtllm event change (#866) · 3c1c2ac3
  Ziqi Fan authored Apr 28, 2025

28 Apr, 2025 2 commits

- fix: change the processor number to 5 to reduce the tokenization bottleneck (#865) · 6630fa5c
  richardhuo-nv authored Apr 28, 2025

  We observed a 40% performance drop compared with trtllm serve when benchmarking with isl=1000 and osl=200 at concurrency levels above 128.

  The number of tokenization workers was the bottleneck. After raising the number of tokenization processors to 5, dynamo's benchmark performance matched that of trtllm serve.

- feat: Add unified x86 / aarch64 (ARM) build for VLLM image (#839) · 566068dc
  Ryan McCormick authored Apr 28, 2025

24 Apr, 2025 2 commits
- chore: Increase sleep times from 2s -> 30s for startup logs (#807) · aae0d405
  Ryan McCormick authored Apr 23, 2025
  
- fix: Update TRTLLM version and fix disagg workflow (#804) · 197105eb
  Tanmay Verma authored Apr 23, 2025
  
17 Apr, 2025 1 commit
- fix: direct clients vs dependancies (#704) · c30c6990
  ishandhanani authored Apr 16, 2025
```
Co-authored-by: Ziqi Fan <ziqif@nvidia.com>
```
15 Apr, 2025 1 commit
- fix: set correct parent_hash for each kv block when publish kv events (#671) · 15455dff
  Ziqi Fan authored Apr 14, 2025
  
12 Apr, 2025 1 commit
- fix: change trtllm kv_router default block_size to 32 (#642) · 8edd23dc
  Ziqi Fan authored Apr 12, 2025
  
11 Apr, 2025 3 commits
- docs: Add documentation for UCX KV cache transfer in TRTLLM (#639) · 8d35dc43
  Tanmay Verma authored Apr 11, 2025
  
- docs: Add instructions to install git lfs (#627) · ee986548
  Tanmay Verma authored Apr 11, 2025
  
- feat: TRT-LLM disaggregated serving using UCX (#562) · da38e96a
  Tanmay Verma authored Apr 10, 2025
```
Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
Signed-off-by: Tanmay Verma <tanmayv@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
```
10 Apr, 2025 1 commit
- docs: update dynamo serve trtllm agg example yaml files (#600) · 34be4418
  Ziqi Fan authored Apr 10, 2025
  
09 Apr, 2025 1 commit
- docs: Move trtllm dynamo run doc from example to dynamo run guide (#578) · 0186aa7b
  Tanmay Verma authored Apr 09, 2025
  
08 Apr, 2025 1 commit

- chore: Update TRTLLM version. Fix router. (#527) · 7dca64df
  Tanmay Verma authored Apr 07, 2025

  Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
  Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>

07 Apr, 2025 2 commits
- docs: update close-deployment in dynamo_serve.md (#535) · df54b9cb
  tlipoca9 authored Apr 08, 2025
```
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
```
- fix: mypy error (#543) · 6eb31507
  ishandhanani authored Apr 07, 2025
```
Co-authored-by: finofliu <finofliu@tencent.com>
```
04 Apr, 2025 1 commit
- fix: broken link to dynamo run (#517) · bd8f0804
  Kyungmin Lee authored Apr 04, 2025
  
03 Apr, 2025 1 commit
- feat: Add TensorRT-LLM example for dynamo serve/run (#456) · 6eb10540
  Tanmay Verma authored Apr 03, 2025
```
Co-authored-by: Neelay Shah <neelays@nvidia.com>
```