1. 12 Jan, 2026 1 commit
  2. 10 Jan, 2026 1 commit
  3. 03 Jan, 2026 1 commit
  4. 02 Jan, 2026 1 commit
  5. 31 Dec, 2025 1 commit
  6. 25 Dec, 2025 1 commit
  7. 19 Dec, 2025 2 commits
  8. 18 Dec, 2025 2 commits
  9. 11 Dec, 2025 1 commit
  10. 04 Dec, 2025 1 commit
  11. 02 Dec, 2025 1 commit
  12. 21 Nov, 2025 1 commit
  13. 13 Nov, 2025 2 commits
  14. 11 Nov, 2025 1 commit
  15. 28 Oct, 2025 1 commit
  16. 23 Oct, 2025 2 commits
  17. 22 Oct, 2025 1 commit
  18. 21 Oct, 2025 1 commit
  19. 16 Oct, 2025 1 commit
  20. 02 Oct, 2025 1 commit
  21. 01 Oct, 2025 1 commit
  22. 16 Sep, 2025 1 commit
  23. 22 Aug, 2025 1 commit
  24. 30 Jun, 2025 1 commit
    • chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
      Graham King authored
      Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.
      
      Example usage:
      
      1. Create a `LocalModel`:
      
      ```
      let local_model = LocalModelBuilder::default()
          .model_path("Qwen/Qwen3-0.6B")
          .http_port(8080)
          .build().await?;
      ```
      
      2. Make an engine:
      
      ```
      let engine_config = EngineConfig::StaticFull {
          engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
          model: Box::new(local_model),
      };
      ```
      
      3. Connect it to an input and run it:
      
      ```
      dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
      ```
      
      For https://github.com/ai-dynamo/dynamo/issues/1647
      
      Code Rabbit summary, thanks:
        * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
        * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
        * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
        * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
        * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
        * Streamlined configuration and validation for flags and router settings.
        * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
  25. 30 May, 2025 1 commit
  26. 17 Mar, 2025 1 commit
  27. 13 Mar, 2025 1 commit
  28. 10 Mar, 2025 1 commit
  29. 09 Mar, 2025 2 commits
  30. 08 Mar, 2025 1 commit
  31. 05 Mar, 2025 1 commit
  32. 28 Feb, 2025 1 commit
  33. 27 Feb, 2025 1 commit
  34. 26 Feb, 2025 1 commit
  35. 25 Feb, 2025 1 commit
    • feat: sglang backend for tio (#271) · e97493eb
      Graham King authored
      - Set up the venv:
      
      ```
      uv venv
      source .venv/bin/activate
      uv pip install pip
      uv pip install sgl-kernel --force-reinstall --no-deps
      uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
      ```
      
      - Build: `cargo build --release --features sglang`
      
      - Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
      
      - Run DeepSeek multi-gpu / multi-node:
      
      Node 1:
      ```
      tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
      ```
      
      Node 2:
      ```
      tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
      ```
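
      The two commands above differ only in `in=` and `--node-rank`. As a rough sketch, the per-node argument list could be generated by a small helper like the one below. `tio_args` is a hypothetical function for illustration, not part of `tio`, and the rule that only rank 0 uses `in=http` is an assumption generalized from the two-node example.

      ```rust
      /// Hypothetical helper: build the `tio` arguments for one node of a
      /// multi-node sglang run, mirroring the two commands shown above.
      fn tio_args(
          model_path: &str,
          tensor_parallel_size: u32,
          num_nodes: u32,
          node_rank: u32,
          dist_init_addr: &str,
      ) -> Vec<String> {
          // Assumption: rank 0 takes HTTP requests, every other rank has no input.
          let input = if node_rank == 0 { "in=http" } else { "in=none" };
          vec![
              input.to_string(),
              "out=sglang".to_string(),
              "--model-path".to_string(),
              model_path.to_string(),
              "--tensor-parallel-size".to_string(),
              tensor_parallel_size.to_string(),
              "--num-nodes".to_string(),
              num_nodes.to_string(),
              "--node-rank".to_string(),
              node_rank.to_string(),
              "--dist-init-addr".to_string(),
              dist_init_addr.to_string(),
          ]
      }

      fn main() {
          // Print the command line for each of the two nodes in the example.
          for rank in 0..2 {
              let args = tio_args(
                  "~/llm_models/DeepSeek-R1-Distill-Llama-70B/",
                  8,
                  2,
                  rank,
                  "10.217.98.122:9876",
              );
              println!("tio {}", args.join(" "));
          }
      }
      ```

      Every node must be given the same `--dist-init-addr`, `--num-nodes` and `--tensor-parallel-size`; only the rank and the input mode change.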