Commits · 0086ebc630326cbf705328580aa8ce6dcf5a98d4 · OpenDAS / dynamo


01 May, 2025 1 commit
- fix: add dedicated llmapi config for trtllm disagg kv routing example (#916) · 0086ebc6
  Ziqi Fan authored Apr 30, 2025
  
28 Apr, 2025 1 commit

fix: change the processor number to 5 to reduce the tokenization bottleneck (#865) · 6630fa5c

richardhuo-nv authored Apr 28, 2025

We observed a 40% performance drop compared with trtllm serve when benchmarking with isl=1000 and osl=200 at concurrency levels above 128.

The number of tokenization workers was the bottleneck. After increasing the number of tokenization processors to 5, dynamo's benchmark performance matches that of trtllm serve.


11 Apr, 2025 1 commit

feat: TRT-LLM disaggregated serving using UCX (#562) · da38e96a

Tanmay Verma authored Apr 10, 2025


Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
Signed-off-by: Tanmay Verma <tanmayv@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>


10 Apr, 2025 1 commit
- docs: update dynamo serve trtllm agg example yaml files (#600) · 34be4418
  Ziqi Fan authored Apr 10, 2025
  
08 Apr, 2025 1 commit

chore: Update TRTLLM version. Fix router. (#527) · 7dca64df

Tanmay Verma authored Apr 07, 2025


Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>


03 Apr, 2025 1 commit
- feat: Add TensorRT-LLM example for dynamo serve/run (#456) · 6eb10540
  Tanmay Verma authored Apr 03, 2025
  Co-authored-by: Neelay Shah <neelays@nvidia.com>