1. 19 Dec, 2025 2 commits
  2. 18 Dec, 2025 1 commit
  3. 25 Nov, 2025 1 commit
  4. 20 Nov, 2025 1 commit
  5. 13 Nov, 2025 1 commit
  6. 11 Nov, 2025 1 commit
  7. 08 Nov, 2025 1 commit
  8. 07 Nov, 2025 1 commit
  9. 27 Oct, 2025 1 commit
  10. 21 Oct, 2025 1 commit
  11. 07 Oct, 2025 1 commit
  12. 05 Sep, 2025 1 commit
  13. 03 Sep, 2025 1 commit
  14. 27 Aug, 2025 1 commit
  15. 22 Aug, 2025 1 commit
  16. 21 Aug, 2025 1 commit
  17. 19 Aug, 2025 2 commits
  18. 18 Aug, 2025 1 commit
  19. 15 Aug, 2025 1 commit
  20. 06 Aug, 2025 2 commits
  21. 05 Aug, 2025 1 commit
  22. 18 Jul, 2025 1 commit
  23. 30 Jun, 2025 1 commit
    • chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
      Graham King authored
      Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.
      
      Example usage:
      
      1. Create a `LocalModel`:
      
      ```
      let local_model = LocalModelBuilder::default()
          .model_path("Qwen/Qwen3-0.6B")
          .http_port(8080)
          .build().await?;
      ```
      
      2. Make an engine:
      
      ```
      let engine_config = EngineConfig::StaticFull {
          engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
          model: Box::new(local_model),
      };
      ```
      
      3. Connect it to an input and run it:
      
      ```
      dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
      ```
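
      For readers unfamiliar with the builder pattern used in step 1, here is a minimal, self-contained sketch. The types below are hypothetical stand-ins and not the real `dynamo-llm` definitions: the field set, the error type, the fallback port, and the synchronous `build` are all assumptions made to keep the example runnable without extra crates.

      ```rust
      // Hypothetical, simplified stand-ins; NOT the real dynamo-llm types.
      #[derive(Debug, Clone, PartialEq)]
      struct LocalModel {
          model_path: String,
          http_port: u16,
      }

      #[derive(Debug, Default)]
      struct LocalModelBuilder {
          model_path: Option<String>,
          http_port: Option<u16>,
      }

      impl LocalModelBuilder {
          // Each setter takes the builder by value and returns it, so calls chain.
          fn model_path(mut self, path: &str) -> Self {
              self.model_path = Some(path.to_string());
              self
          }

          fn http_port(mut self, port: u16) -> Self {
              self.http_port = Some(port);
              self
          }

          // Synchronous here for brevity; the builder in the commit is awaited.
          fn build(self) -> Result<LocalModel, String> {
              let model_path = self
                  .model_path
                  .ok_or_else(|| "model_path is required".to_string())?;
              Ok(LocalModel {
                  model_path,
                  // Assumed fallback port, chosen only for this sketch.
                  http_port: self.http_port.unwrap_or(8080),
              })
          }
      }

      fn main() {
          let model = LocalModelBuilder::default()
              .model_path("Qwen/Qwen3-0.6B")
              .http_port(8080)
              .build()
              .expect("model_path was set");
          println!("{model:?}");
      }
      ```

      Validating in `build` rather than in each setter keeps the setters infallible, so a missing required field is reported once, at construction time.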
      
      For https://github.com/ai-dynamo/dynamo/issues/1647
      
      Code Rabbit summary, thanks:
        * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
        * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
        * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
        * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
        * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
        * Streamlined configuration and validation for flags and router settings.
        * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
  24. 26 Jun, 2025 1 commit
  25. 25 Jun, 2025 1 commit
  26. 12 Jun, 2025 1 commit
  27. 04 Jun, 2025 2 commits
  28. 02 Jun, 2025 1 commit
  29. 21 May, 2025 2 commits
  30. 19 May, 2025 1 commit
    • feat: Support multiple models on single ingress node (#1127) · aeb79e62
      Graham King authored
      We can now do this:
      
      - Node 1:
      
      ```
      dynamo-run in=http out=dyn
      ```
      
      - Node 2 and 3, two instances of the 'backend' component in the nemotron_ultra pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_ultra.backend.generate out=vllm /data/models/NemotronUltra
      ```
      
      - Node 4 and 5, two instances of the 'backend' component in the nemotron_super pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_super.backend.generate out=vllm /data/models/NemotronSuper
      ```
      
      The ingress node will discover all four instances and route correctly. We have been planning for this for a long time now.
      
      As part of this, auto-discovery is now always `out=dyn`, with no extra URL parts. Previously it could only route to a single pipeline.
      
      Also:
      - Refactor endpoint / instance naming now that I understand them.
      - Fix removing models when their instance stops.
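
      To illustrate the `dyn://` addresses used above, here is a minimal sketch of splitting one into its three dot-separated parts. `EndpointId` and `parse_endpoint` are hypothetical names for this example and not the dynamo-runtime API. The field names follow the wording of this message (pipeline, component, endpoint), and the strict three-part rule is an assumption.

      ```rust
      // Hypothetical type for illustration; not the dynamo-runtime definition.
      #[derive(Debug, PartialEq, Eq)]
      struct EndpointId {
          pipeline: String,
          component: String,
          endpoint: String,
      }

      // Parse "dyn://<pipeline>.<component>.<endpoint>".
      // Returns None for a different scheme, a wrong part count, or an empty part.
      fn parse_endpoint(url: &str) -> Option<EndpointId> {
          let rest = url.strip_prefix("dyn://")?;
          let parts: Vec<&str> = rest.split('.').collect();
          if parts.len() != 3 || parts.iter().any(|p| p.is_empty()) {
              return None;
          }
          Some(EndpointId {
              pipeline: parts[0].to_string(),
              component: parts[1].to_string(),
              endpoint: parts[2].to_string(),
          })
      }

      fn main() {
          for url in [
              "dyn://nemotron_ultra.backend.generate",
              "dyn://nemotron_super.backend.generate",
          ] {
              println!("{url} -> {:?}", parse_endpoint(url));
          }
      }
      ```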
  31. 15 May, 2025 1 commit
    • fix: Fix default RouterMode value (#1092) · 889ab67e
      Graham King authored
      The Python bindings use the default value for RouterMode. Previously that was Random (good), but now it became None (bad).
      
      Remove the option and clean up the duplicate RouterMode. I was trying to avoid putting the `KV` enum in dynamo-runtime. Turns out adding those two characters gives us a healthy simplification, and restores the old default router value.
      
      Also clean up two noisy log messages when waiting for KV routing metrics to start in worker.
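
      The default-value regression described above can be reproduced in a few lines. This is a sketch under assumptions, not the dynamo-runtime source: the enum below is a hypothetical stand-in, and the `RoundRobin` variant is included only for illustration. It shows why wrapping the mode in `Option` changes the default to `None`, while putting `#[default]` on the enum itself keeps it at `Random`.

      ```rust
      // Hypothetical stand-in for the router mode enum; variants are assumptions
      // except Random and KV, which the commit message mentions.
      #[allow(dead_code)]
      #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
      enum RouterMode {
          #[default]
          Random,
          RoundRobin,
          KV,
      }

      fn main() {
          // The default of Option<T> is always None, regardless of T's own default.
          let wrapped: Option<RouterMode> = Default::default();
          assert_eq!(wrapped, None);

          // Without the Option, the default is the variant marked #[default].
          let plain = RouterMode::default();
          assert_eq!(plain, RouterMode::Random);

          println!("wrapped = {wrapped:?}, plain = {plain:?}");
      }
      ```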
  32. 14 May, 2025 2 commits
  33. 07 May, 2025 1 commit
  34. 06 May, 2025 1 commit
    • feat(dynamo-run): vllm and sglang subprocess engines (#954) · 28fd481c
      Graham King authored
      New vllm and sglang engines that run in a subprocess. They will hopefully replace the existing embedded Python engines.
          
      Why?
          
        - Pure Python, does not require knowing Rust to work on it. Much simpler to maintain.
        - No embedded Python interpreter, which avoids linking libpython and avoids the macOS virtualenv issues.
        - Should have better performance as it's "native" vllm / sglang.
        - Works with any version of vllm (including v1!) and sglang. Less upgrade struggle.