- 21 Aug, 2025 1 commit
-
-
Tzu-Ling Kan authored
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
-
- 19 Aug, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 14 Aug, 2025 1 commit
-
-
Jorge António authored
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
-
- 07 Aug, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 01 Aug, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 28 Jul, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 24 Jul, 2025 2 commits
-
-
Yan Ru Pei authored
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
-
Yan Ru Pei authored
-
- 23 Jul, 2025 1 commit
-
-
Yan Ru Pei authored
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
-
- 14 Jul, 2025 1 commit
-
-
Yan Ru Pei authored
-
- 10 Jul, 2025 3 commits
-
-
Yan Ru Pei authored
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
Co-authored-by: Hongkuan Zhou <tedzhouhk@gmail.com>
-
Alec authored
Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
Co-authored-by: ptarasiewiczNV <104908264+ptarasiewiczNV@users.noreply.github.com>
Co-authored-by: Hongkuan Zhou <tedzhouhk@gmail.com>
-
Yan Ru Pei authored
-
- 08 Jul, 2025 1 commit
-
-
Yan Ru Pei authored
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
-
- 07 Jul, 2025 1 commit
-
-
jain-ria authored
Signed-off-by: jain-ria <riajain@NVIDIA.com>
Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
-
- 01 Jul, 2025 2 commits
-
-
jthomson04 authored
-
jthomson04 authored
-
- 30 Jun, 2025 1 commit
-
-
Graham King authored
Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.

Example usage:

1. Create a `LocalModel`:
```
let local_model = LocalModelBuilder::default()
    .model_path("Qwen/Qwen3-0.6B")
    .http_port(8080)
    .build().await?;
```

2. Make an engine:
```
let engine_config = EngineConfig::StaticFull {
    engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
    model: Box::new(local_model),
};
```

3. Connect it to an input and run it:
```
dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
```

For https://github.com/ai-dynamo/dynamo/issues/1647

Code Rabbit summary, thanks:
* Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
* Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
* Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
* Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
* Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
* Streamlined configuration and validation for flags and router settings.
* Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
-
- 27 Jun, 2025 1 commit
-
-
Yan Ru Pei authored
feat: Unnormalize waiting requests + predictive load updates for Python router (mirroring Rust) + softmax sampling to reduce thrashing (#1638)
-
- 14 Jun, 2025 1 commit
-
-
Yan Ru Pei authored
Signed-off-by: PeaBrane <yanrpei@gmail.com>
Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
Signed-off-by: jain-ria <riajain@NVIDIA.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: jain-ria <riajain@NVIDIA.com>
-
- 12 Jun, 2025 1 commit
-
-
Tianer Zhou authored
Signed-off-by: Tianer Zhou <ezhoureal@gmail.com>
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
-
- 02 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 30 May, 2025 3 commits
-
-
jain-ria authored
-
Alec authored
-
jthomson04 authored
-
- 29 May, 2025 2 commits
- 28 May, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 22 May, 2025 1 commit
-
-
Graham King authored
Removed the hard-coded sleeps, explained what we're testing.

Closes https://github.com/ai-dynamo/dynamo/issues/1132

The race condition is that `apply_event` sends a message on a channel; it does not directly apply the event. At some later point the tokio runtime schedules the task running the channel receiver, which applies the event. If that had not happened by the time the test checked the result, the test would fail.
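A minimal sketch of the pattern described above, not the code from this commit. It assumes a hypothetical `Indexer` whose `apply_event` only enqueues work, and it uses a `std::thread` worker with `std::sync::mpsc` in place of a tokio task so that it stays self-contained. Instead of sleeping, the caller sends a query over the same channel; because a channel delivers messages in order, the reply can only arrive after every earlier event has been applied.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages handled by the background worker (hypothetical names).
enum Command {
    /// Add a value to the running total.
    Apply(u64),
    /// Report the current total; the reply also acts as a barrier.
    Total(mpsc::Sender<u64>),
}

/// Hypothetical stand-in for a component whose `apply_event` is asynchronous.
struct Indexer {
    commands: mpsc::Sender<Command>,
}

impl Indexer {
    /// Spawn the worker that owns the state and drains the channel.
    fn start() -> Self {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            let mut total = 0u64;
            // The loop ends when every sender has been dropped.
            for cmd in rx {
                match cmd {
                    Command::Apply(n) => total += n,
                    Command::Total(reply) => {
                        let _ = reply.send(total);
                    }
                }
            }
        });
        Indexer { commands: tx }
    }

    /// Only enqueues the event; the worker applies it at some later point.
    fn apply_event(&self, n: u64) {
        self.commands.send(Command::Apply(n)).expect("worker stopped");
    }

    /// Blocks until the worker has processed everything sent before this call.
    fn total(&self) -> u64 {
        let (tx, rx) = mpsc::channel();
        self.commands.send(Command::Total(tx)).expect("worker stopped");
        rx.recv().expect("worker stopped")
    }
}
```

A test built this way calls `apply_event` a few times and then asserts on `total()`, with no sleep in between. Reading the state directly right after `apply_event` would reintroduce the race.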
-
- 15 May, 2025 1 commit
-
-
Graham King authored
The Python bindings use the default value for RouterMode. Previously that was Random (good), but now it became None (bad). Remove the option and clean up the duplicate RouterMode.

I was trying to avoid putting the `KV` enum in dynamo-runtime. Turns out adding those two characters gives us a healthy simplification, and restores the old default router value.

Also clean up two noisy log messages when waiting for KV routing metrics to start in worker.
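The default-value problem can be illustrated with a small sketch. The type and field names here (`OptionalConfig`, `PlainConfig`, `router_mode`) are assumptions for illustration and are not the actual dynamo-runtime definitions; only the `Random` and `KV` variant names come from the message above.

```rust
/// Illustrative router mode; `Random` is marked as the default variant.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum RouterMode {
    #[default]
    Random,
    KV,
}

/// With an `Option`, the derived default is `None`, not `Random`.
#[derive(Debug, Default)]
struct OptionalConfig {
    router_mode: Option<RouterMode>,
}

/// Without the `Option`, the derived default is the enum's own default.
#[derive(Debug, Default)]
struct PlainConfig {
    router_mode: RouterMode,
}
```

Under these assumptions, `OptionalConfig::default().router_mode` is `None`, while `PlainConfig::default().router_mode` is `RouterMode::Random`, which is the behavior the commit restores by removing the option.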
-
- 14 May, 2025 1 commit
-
-
Graham King authored
Router:
```
dynamo-run in=http out=dyn://dynamo.endpoint.generate --router-mode kv
```

Worker (* N):
```
dynamo-run in=dyn://dynamo.endpoint.generate out=vllm /data/llms/Qwen/Qwen3-4B
```

You need patched vllm and the C bindings `.so`. Full docs in the updated guide: `docs/guides/dynamo_run.md`.

This gives us a pure-Rust ingress node: OpenAI compliant HTTP server + Pre-processor + KV-aware router.
-
- 08 May, 2025 2 commits
-
-
Hongkuan Zhou authored
-
Yan Ru Pei authored
-
- 21 Apr, 2025 1 commit
-
-
ishandhanani authored
-
- 04 Apr, 2025 2 commits
-
-
Yan Ru Pei authored
-
Graham King authored
Also upgrade the cargo resolver to v3, the default.

New clippy lints:
- `next_back()` instead of `last()` for a double-ended iterator. That avoids walking the whole list.
- `repeat_n` instead of `repeat().take()`. That avoids cloning.
- Doc indenting
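A short sketch of the two iterator changes those lints ask for. The helper names `last_word` and `padding` are hypothetical, and `std::iter::repeat_n` assumes a toolchain of Rust 1.82 or newer.

```rust
use std::iter;

/// Last element via `next_back()`: a double-ended iterator can step in from
/// the end directly, whereas `last()` consumes the iterator from the front.
fn last_word<'a>(words: &[&'a str]) -> Option<&'a str> {
    words.iter().next_back().copied()
}

/// `repeat_n(value, n)` yields `value` exactly `n` times and moves the
/// original into the final item, so it makes one clone fewer than
/// `repeat(value).take(n)`.
fn padding(unit: &str, n: usize) -> Vec<String> {
    iter::repeat_n(unit.to_string(), n).collect()
}
```

Both helpers return the same values as their `last()` and `repeat().take()` equivalents; only the amount of work done differs.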
-
- 02 Apr, 2025 1 commit
-
-
Ryan Olson authored
-
- 31 Mar, 2025 1 commit
-
-
Tianer Zhou authored
Signed-off-by: Tianer Zhou <ezhoureal@gmail.com>
-
- 17 Mar, 2025 1 commit
-
-
GuanLuo authored
-
- 14 Mar, 2025 1 commit
-
-
Ryan McCormick authored
-