Commits · 4be72b6f6058ac3c4159148efc7ad750a2a64651 · OpenDAS / dynamo

12 Jan, 2026 1 commit
- feat: allow router not assuming decode kv reuse (#5350) · 7fdc742e
  Yan Ru Pei authored Jan 12, 2026
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  7fdc742e
10 Jan, 2026 1 commit
- fix: fix bug in diffing logic in list_and_watch (#5318) · 2f9812aa
  mohammedabdulwahhab authored Jan 10, 2026
```
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
```
  2f9812aa
03 Jan, 2026 1 commit
- feat: mockers with bootstrap optimization (sglang testing) + CI test (#5121) · 0980b27f
  Yan Ru Pei authored Jan 02, 2026
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  0980b27f
02 Jan, 2026 1 commit
- chore: update all copyright headers in repo to 2026 (#5130) · cf433e68
  Tushar Sharma authored Jan 02, 2026
```
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
```
  cf433e68
31 Dec, 2025 1 commit
- fix: sglang disagg routing fixes and optimizations [DYN-1692] (#5106) · 0b33c1df
  Yan Ru Pei authored Dec 31, 2025
  Signed-off-by: PeaBrane <yanrpei@gmail.com>
  Co-authored-by: Ishan Dhanani <ishandhanani@gmail.com>
  Co-authored-by: Sean SH Choi <sechoi@nvidia.com>
  Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
  0b33c1df
25 Dec, 2025 1 commit
- feat: Book-keeping bindings [DEP-689] (#5036) · 01f77f2c
  atchernych authored Dec 25, 2025
```
Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
```
  01f77f2c
19 Dec, 2025 2 commits
- feat: Request Migration Metrics (#5029) · e6a6a1f2
  Jacky authored Dec 19, 2025
```
Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
```
  e6a6a1f2
- feat: add multimodal support to KV router with standalone trtllm example (#4577) · 481dc636
  zhongdaor-nv authored Dec 18, 2025
```
Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
Signed-off-by: zhongdaor-nv <zhongdaor@nvidia.com>
```
  481dc636
18 Dec, 2025 2 commits
- feat: support disag serving in GAIE [DEP-659] (#4756) · 15b49818
  atchernych authored Dec 18, 2025
```
Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
```
  15b49818
- feat(frontend): First part of Python request handling (#4999) · da0f2fb8
  Graham King authored Dec 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  da0f2fb8
11 Dec, 2025 1 commit
- feat: early rejection based on active prefill tokens (#4837) · 10b01b45
  Yan Ru Pei authored Dec 11, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  10b01b45
04 Dec, 2025 1 commit
- chore: no need to arc wrap client (#4741) · 5b24b429
  Yan Ru Pei authored Dec 03, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  5b24b429
02 Dec, 2025 1 commit
- feat: dynamic setting of thresholds for rejection (#4673) · 4c1bc4ee
  Yan Ru Pei authored Dec 02, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  4c1bc4ee
21 Nov, 2025 1 commit
- chore: merge KvIndexer and ApproxKvIndexer (#4500) · c61e0dd3
  Yan Ru Pei authored Nov 21, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  c61e0dd3
13 Nov, 2025 2 commits
- chore: better error handling in prefill router (#4286) · ce833983
  Yan Ru Pei authored Nov 13, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  ce833983
- feat: kv router should route to available instances (#4225) · 8379b0cd
  Yan Ru Pei authored Nov 12, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  8379b0cd
11 Nov, 2025 1 commit
- chore: Remove static mode (#4235) · e1af3af6
  Graham King authored Nov 11, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  e1af3af6
28 Oct, 2025 1 commit
- feat: mocker disagg (#3833) · cc4c3516
  Yan Ru Pei authored Oct 28, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  cc4c3516
23 Oct, 2025 2 commits
- chore: Use KeyValueStoreManager instead of etcd::Client (#3822) · 7731b024
  Graham King authored Oct 23, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  7731b024
- feat: remove component parameter from EPP (#3831) · 6f9be594
  atchernych authored Oct 23, 2025
  Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
  Signed-off-by: atchernych <atchernych@nvidia.com>
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  6f9be594
22 Oct, 2025 1 commit
- fix: Load deployment card from ModelExpress for EPP (#3793) · c8adbe6f
  atchernych authored Oct 22, 2025
```
Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
```
  c8adbe6f
21 Oct, 2025 1 commit
- feat: bake prefill router into frontend, supporting vllm for now (#3762) · e01c6e99
  Yan Ru Pei authored Oct 21, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  e01c6e99
16 Oct, 2025 1 commit
- feat: dp rank routing (#3597) · f978f4d1
  Yan Ru Pei authored Oct 15, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  f978f4d1
02 Oct, 2025 1 commit
- fix: Adjust func signature to match main post-merge (#3357) · 91ba9026
  atchernych authored Oct 01, 2025
```
Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
```
  91ba9026
01 Oct, 2025 1 commit
- feat: Create worker selection pipeline (#3080) · 5194acbd
  atchernych authored Oct 01, 2025
```
Signed-off-by: Anna Tchernych <atchernych@nvidia.com>
```
  5194acbd
16 Sep, 2025 1 commit
- chore: Remove more extended Apache headers (#3063) · 723f2da7
  Graham King authored Sep 16, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  723f2da7
22 Aug, 2025 1 commit
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
30 Jun, 2025 1 commit
- chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
  Graham King authored Jun 30, 2025

Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.

Example usage:

1. Create a `LocalModel`:

```
let local_model = LocalModelBuilder::default()
    .model_path("Qwen/Qwen3-0.6B")
    .http_port(8080)
    .build()
    .await?;
```

2. Make an engine:

```
let engine_config = EngineConfig::StaticFull {
    engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
    model: Box::new(local_model),
};
```

3. Connect it to an input and run it:

```
dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
```

For https://github.com/ai-dynamo/dynamo/issues/1647

Code Rabbit summary, thanks:
  * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
  * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
  * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
  * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
  * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
  * Streamlined configuration and validation for flags and router settings.
  * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
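
To illustrate the builder pattern mentioned in the summary, here is a minimal standalone sketch. It is not the `dynamo-llm` API: `ModelConfig` and `ModelConfigBuilder` are hypothetical stand-ins for `LocalModel` and `LocalModelBuilder`, the builder is synchronous for simplicity, and the default port of 8080 is an assumption.

```rust
// Hypothetical stand-in types; these are NOT the real dynamo-llm types.
#[derive(Debug, Clone, PartialEq)]
struct ModelConfig {
    model_path: String,
    http_port: u16,
}

// Every field is optional until `build` validates and fills in defaults.
#[derive(Debug, Default)]
struct ModelConfigBuilder {
    model_path: Option<String>,
    http_port: Option<u16>,
}

impl ModelConfigBuilder {
    fn model_path(mut self, path: &str) -> Self {
        self.model_path = Some(path.to_string());
        self
    }

    fn http_port(mut self, port: u16) -> Self {
        self.http_port = Some(port);
        self
    }

    // Validation happens once, at build time, instead of at every call site.
    fn build(self) -> Result<ModelConfig, String> {
        let model_path = self
            .model_path
            .ok_or_else(|| "model_path is required".to_string())?;
        Ok(ModelConfig {
            model_path,
            // Assumed default for this sketch only.
            http_port: self.http_port.unwrap_or(8080),
        })
    }
}

fn main() {
    let config = ModelConfigBuilder::default()
        .model_path("Qwen/Qwen3-0.6B")
        .http_port(8080)
        .build()
        .expect("model_path was provided");
    println!("{config:?}");
}
```

Collecting optional fields and validating them in `build` keeps required-field checks in one place, which is one common reason to prefer a builder over direct struct construction.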

92f06b0e

30 May, 2025 1 commit
- refactor: Refactor kv event publishers (#1287) · 9210a26d
  jthomson04 authored May 30, 2025
  
  9210a26d
17 Mar, 2025 1 commit
- feat: expose Python binding for KVEventPublisher. Use event pub/sub trait for KV events (#169) · 6e09681e
  GuanLuo authored Mar 17, 2025
  
  6e09681e
13 Mar, 2025 1 commit
- build: add top level rust workspace (#137) · 3d292851
  Anant Sharma authored Mar 13, 2025
  
  3d292851
10 Mar, 2025 1 commit
- chore: update wheel name and reset versions (#73) · fc4da345
  Anant Sharma authored Mar 10, 2025
  
  fc4da345
09 Mar, 2025 2 commits
- feat: make block_size input for indexer, router, publisher (#66) · 989bb3d5
  Alec authored Mar 09, 2025
  
  989bb3d5
- chore: left over renaming (#67) · 678cffb4
  Neelay Shah authored Mar 09, 2025
```
Co-authored-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
```
  678cffb4
08 Mar, 2025 1 commit
- chore: rename dynamo (#44) · 602352ce
  Neelay Shah authored Mar 08, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
  602352ce
05 Mar, 2025 1 commit
- refactor: rename triton_distributed to dynemo (#22) · 1af7433b
  Neelay Shah authored Mar 05, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  1af7433b
28 Feb, 2025 1 commit
- feat: TensorRT-LLM engine (#317) · 057f8f47
  Graham King authored Feb 28, 2025
```
Engine, `tio` support and docs.

Proof of concept / experimental.
```
  057f8f47
27 Feb, 2025 1 commit
- ci: build wheel from root directory (#274) · ea401e3b
  Anant Sharma authored Feb 27, 2025
  
  ea401e3b
26 Feb, 2025 1 commit
- refactor: using async_openai · 86aff237
  Paul Hendricks authored Feb 26, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  86aff237
25 Feb, 2025 1 commit
- feat: sglang backend for tio (#271) · e97493eb
  Graham King authored Feb 25, 2025

- Set up the venv:

```
uv venv
source .venv/bin/activate
uv pip install pip
uv pip install sgl-kernel --force-reinstall --no-deps
uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
```

- Build: `cargo build --release --features sglang`

- Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`

- Run DeepSeek multi-GPU / multi-node:

Node 1:
```
tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
```

Node 2:
```
tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
```

e97493eb