Commits · 7731b0245cf53eb81440d9dfeeb8c2dd91065f42 · OpenDAS / dynamo

23 Oct, 2025 1 commit
- chore: Use KeyValueStoreManager instead of etcd::Client (#3822) · 7731b024
  Graham King authored Oct 23, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
22 Oct, 2025 1 commit
- feat: python gil release for radix tree + dump_tree_as_events in python (#3748) · 681951d4
  Michael Feil authored Oct 21, 2025
```
Signed-off-by: michaelfeil <me@michaelfeil.eu>
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
```
21 Oct, 2025 1 commit
- feat: bake prefill router into frontend, supporting vllm for now (#3762) · e01c6e99
  Yan Ru Pei authored Oct 21, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
17 Oct, 2025 1 commit
- chore: remove kv metrics scraping and aggregation (#3701) · 4c207e0c
  Yan Ru Pei authored Oct 17, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
16 Oct, 2025 1 commit
- feat: dp rank routing (#3597) · f978f4d1
  Yan Ru Pei authored Oct 15, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
13 Oct, 2025 1 commit
- feat: remove stale workers on snapshot + some refactoring (#3589) · b5e762b2
  Yan Ru Pei authored Oct 13, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
11 Oct, 2025 1 commit
- chore: consolidations of KvPushRouter bindings and usage examples (#3543) · c3fcfdd6
  Yan Ru Pei authored Oct 10, 2025

  Signed-off-by: PeaBrane <yanrpei@gmail.com>
  Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
04 Oct, 2025 1 commit
- feat: use KvPushRouter for prefill router (#3401) · 30610e73
  Yan Ru Pei authored Oct 03, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
01 Oct, 2025 1 commit
- feat: make prefill router general (#3329) · 9b9536d0
  Yan Ru Pei authored Sep 30, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
30 Sep, 2025 1 commit
- fix: python bindings for router should register to etcd as well (#3302) · d354763c
  Yan Ru Pei authored Sep 30, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
26 Sep, 2025 1 commit
- chore: bump vllm version to 0.10.2 (#3180) · 5bb74904
  Alec authored Sep 26, 2025

  Signed-off-by: Alec <aflowers@nvidia.com>
  Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
  Signed-off-by: krishung5 <krish@nvidia.com>
  Co-authored-by: Kris Hung <krish@nvidia.com>
22 Sep, 2025 1 commit
- feat: vllm prefill router (#3155) · 031590fc
  Yan Ru Pei authored Sep 22, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
19 Sep, 2025 1 commit
- chore: Upgrade Rust to 1.90 (#3147) · 7a5a0bd6
  Graham King authored Sep 19, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
17 Sep, 2025 1 commit
- fix: hook up worker removals for indexer (#3095) · 78a3feda
  Yan Ru Pei authored Sep 17, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
16 Sep, 2025 1 commit
- chore: Remove more extended Apache headers (#3063) · 723f2da7
  Graham King authored Sep 16, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
10 Sep, 2025 1 commit
- feat: adds kv indexer metrics (#2905) · 40000976
  blarson-b10 authored Sep 10, 2025
```
Signed-off-by: Brian Larson <brian.larson@baseten.co>
```
03 Sep, 2025 1 commit
- feat: don't modify kv scheduler states on query + more python binding (#2798) · 383e3b3a
  Yan Ru Pei authored Sep 02, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
30 Aug, 2025 1 commit
- feat: Router warm restarts via durable KV event consumers and radix snapshotting (#2756) · 488c8709
  Yan Ru Pei authored Aug 30, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
29 Aug, 2025 1 commit
- feat: add Prometheus metrics integration for KvStats (#2704) · 15539fd0
  Keiven C authored Aug 28, 2025
```
Signed-off-by: Keiven C <213854356+keivenchang@users.noreply.github.com>
```
25 Aug, 2025 1 commit
- feat: python bindings for the entire KvPushRouter + per-request router configs (#2658) · f08729ae
  Yan Ru Pei authored Aug 25, 2025
  
21 Aug, 2025 1 commit
- feat: Add model label for vllm backend metrics (#2474) · 57728909
  Tzu-Ling Kan authored Aug 21, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
08 Jul, 2025 1 commit
- feat: predictive active blocks for routing without load metrics (#1731) · 84e71e27
  Yan Ru Pei authored Jul 08, 2025

  Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
  Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
07 Jul, 2025 1 commit
- feat: vllm speculative decoding metrics (#1549) · 439e977d
  jain-ria authored Jul 07, 2025

  Signed-off-by: jain-ria <riajain@NVIDIA.com>
  Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
01 Jul, 2025 2 commits
- fix: Fix main (#1712) · 6365a015
  jthomson04 authored Jun 30, 2025
  
- feat: Approximate KV Routing (#1636) · aaf283bb
  jthomson04 authored Jun 30, 2025
  
30 Jun, 2025 1 commit
- chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
  Graham King authored Jun 30, 2025

Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.

Example usage:

1. Create a `LocalModel`:

```
let local_model = LocalModelBuilder::default()
    .model_path("Qwen/Qwen3-0.6B")
    .http_port(8080)
    .build()
    .await?;
```

2. Make an engine:

```
let engine_config = EngineConfig::StaticFull {
    engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
    model: Box::new(local_model),
};
```

3. Connect it to an input and run it:

```
dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
```

For https://github.com/ai-dynamo/dynamo/issues/1647

Code Rabbit summary, thanks:
  * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
  * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
  * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
  * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
  * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
  * Streamlined configuration and validation for flags and router settings.
  * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.


27 Jun, 2025 1 commit
- feat: Unnormalize waiting requests + predictive load updates for Python router (mirroring Rust) + softmax sampling to reduce thrashing (#1638) · 8392e7a1
  Yan Ru Pei authored Jun 27, 2025
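
The commit title mentions softmax sampling as a way to reduce thrashing. The commit body is not shown here, so the following is only a minimal sketch of the general technique, not the router's actual code. The function name `softmax_sample`, the `temperature` parameter, and the caller-supplied uniform draw `u` are all assumptions made for illustration. The idea is that, instead of always picking the single cheapest worker (which can send every request to the same place until load metrics refresh), each worker is chosen with probability proportional to `exp(-cost / temperature)`.

```rust
/// Pick a worker index by softmax sampling over per-worker costs.
/// Lower cost means higher probability. `temperature` (> 0) controls how
/// sharply the choice concentrates on the cheapest worker.
/// `u` is a uniform random number in [0, 1) supplied by the caller; it is a
/// parameter here (an assumption of this sketch) so that no external crate
/// is needed and the behaviour is deterministic for a given draw.
fn softmax_sample(costs: &[f64], temperature: f64, u: f64) -> Option<usize> {
    if costs.is_empty() || temperature <= 0.0 {
        return None;
    }
    // Subtract the minimum cost before exponentiating for numerical stability.
    let min_cost = costs.iter().cloned().fold(f64::INFINITY, f64::min);
    let weights: Vec<f64> = costs
        .iter()
        .map(|c| (-(c - min_cost) / temperature).exp())
        .collect();
    let total: f64 = weights.iter().sum();
    // Walk the cumulative distribution until it exceeds the random draw.
    let mut cumulative = 0.0;
    for (i, w) in weights.iter().enumerate() {
        cumulative += w / total;
        if u < cumulative {
            return Some(i);
        }
    }
    // Floating-point rounding can leave the cumulative sum just under 1.0.
    Some(costs.len() - 1)
}
```

With two equally loaded workers, draws below 0.5 select the first and draws above 0.5 select the second, so traffic is split rather than pinned to one worker. A lower temperature makes the choice approach a plain argmin.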


14 Jun, 2025 1 commit
- feat: Standalone Router (#1409) · 13a99b7f
  Yan Ru Pei authored Jun 14, 2025

  Signed-off-by: PeaBrane <yanrpei@gmail.com>
  Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
  Signed-off-by: jain-ria <riajain@NVIDIA.com>
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Co-authored-by: jain-ria <riajain@NVIDIA.com>
30 May, 2025 2 commits
- refactor: rename KvMetricsPublisher to WorkerMetricsPublisher (#1284) · 2f8da9ad
  Alec authored May 30, 2025
  
- refactor: Refactor kv event publishers (#1287) · 9210a26d
  jthomson04 authored May 30, 2025
  
29 May, 2025 2 commits
- fix: Renamed event publisher classes and configuration (#1273) · f67dc38b
  Alec authored May 29, 2025
  
- feat: add KV Event Publishing to vLLM v1 (#1181) · 0df6d462
  Alec authored May 29, 2025
  
14 May, 2025 1 commit
- feat(dynamo-run): KV-aware routing (#1064) · 29813508
  Graham King authored May 14, 2025

Router:
```
dynamo-run in=http out=dyn://dynamo.endpoint.generate --router-mode kv
```

Workers (start N of these):
```
dynamo-run in=dyn://dynamo.endpoint.generate out=vllm /data/llms/Qwen/Qwen3-4B
```

You need a patched vLLM and the C bindings `.so` file. Full docs are in the updated guide: `docs/guides/dynamo_run.md`.

This gives us a pure-Rust ingress node: OpenAI compliant HTTP server + Pre-processor + KV-aware router.
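
To illustrate what "KV-aware" routing means in general terms, here is a small hedged sketch. It is not the implementation from this commit: the function names `prefix_overlap` and `pick_worker`, the `u64` block hashes, and the model of each worker's cache as a single flat sequence of blocks are all assumptions made for illustration. A real router would presumably also weigh worker load. The sketch simply routes a request to the worker whose cached blocks share the longest prefix with the request, so the most KV cache can be reused.

```rust
use std::collections::HashMap;

/// Count how many leading blocks of `request` match `cached`, in order.
fn prefix_overlap(request: &[u64], cached: &[u64]) -> usize {
    request
        .iter()
        .zip(cached.iter())
        .take_while(|(a, b)| a == b)
        .count()
}

/// Return the id of the worker with the longest cached prefix for `request`.
/// Ties are broken by the lowest worker id so the result is deterministic.
/// Returns `None` when there are no workers.
fn pick_worker(request: &[u64], workers: &HashMap<u32, Vec<u64>>) -> Option<u32> {
    workers
        .iter()
        .map(|(id, cached)| (prefix_overlap(request, cached), *id))
        // Prefer the larger overlap, then the smaller worker id.
        .max_by(|a, b| a.0.cmp(&b.0).then(b.1.cmp(&a.1)))
        .map(|(_, id)| id)
}
```

For example, if worker 1 has cached blocks `[10, 20, 30]` and worker 2 has `[10, 99]`, a request with blocks `[10, 20, 40]` goes to worker 1 because two of its leading blocks are already cached there.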


08 May, 2025 1 commit
- refactor: use primary lease + self-contained graceful shutdown trigged by SIGINT/SIGTERM (#1001) · 466b8e5f
  Hongkuan Zhou authored May 08, 2025
  
21 Apr, 2025 1 commit
- feat: add custom lease to worker components (#748) · c392c341
  ishandhanani authored Apr 21, 2025
  
04 Apr, 2025 1 commit
- feat: KV recorder for dumping router events into a jsonl (#505) · 4b6cfc1b
  Yan Ru Pei authored Apr 04, 2025
  
02 Apr, 2025 1 commit
- feat: kv aware router executable (#399) · c4106e6a
  Ryan Olson authored Apr 02, 2025
  
17 Mar, 2025 1 commit
- feat: expose Python binding for KVEventPublisher. Use event pub/sub trait for KV events (#169) · 6e09681e
  GuanLuo authored Mar 17, 2025
  
11 Mar, 2025 1 commit
- feat: add new metrics and simple router cost fn (#88) · 3f84cdad
  Alec authored Mar 11, 2025
  
09 Mar, 2025 1 commit
- feat: make block_size input for indexer, router, publisher (#66) · 989bb3d5
  Alec authored Mar 09, 2025
  