1. 27 Oct, 2025 1 commit
  2. 21 Oct, 2025 1 commit
  3. 13 Oct, 2025 1 commit
  4. 30 Sep, 2025 1 commit
  5. 24 Sep, 2025 1 commit
  6. 05 Sep, 2025 1 commit
  7. 02 Sep, 2025 1 commit
  8. 22 Aug, 2025 1 commit
  9. 20 Aug, 2025 1 commit
  10. 18 Aug, 2025 1 commit
  11. 13 Aug, 2025 1 commit
  12. 07 Aug, 2025 1 commit
  13. 30 Jul, 2025 1 commit
  14. 28 Jul, 2025 1 commit
  15. 22 Jul, 2025 1 commit
  16. 16 Jul, 2025 1 commit
  17. 08 Jul, 2025 1 commit
  18. 07 Jul, 2025 1 commit
  19. 03 Jul, 2025 1 commit
  20. 13 Jun, 2025 1 commit
  21. 29 May, 2025 1 commit
  22. 09 May, 2025 2 commits
  23. 25 Apr, 2025 2 commits
    • chore: Publish Model Deployment Card to NATS (#799) · d346782c
      Graham King authored
      This will allow an ingress-side pre-processor to see it without needing a model checkout.
      
      Currently pre-processing is done in the worker, which has local access to the model deployment card ("MDC") files (`config.json`, `tokenizer.json` and `tokenizer_config.json`). We want to move the pre-processor to the ingress side to support KV routing. That requires the ingress side (i.e. the HTTP server), which runs on a different machine than the worker, to be able to see those three files.
      
      To support that, this PR makes the worker upload the contents of those files to the NATS object store and publish the MDC, with those NATS URLs, to the key-value store.
      
      The key-value store has an interface so any store (nats, etcd, redis, etc) can be supported. Implementations for memory and NATS are provided.
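
As an illustration of the pluggable store described above, here is a minimal sketch of such an interface with an in-memory backend. The names `KeyValueStore`, `MemoryStore`, `publish_card` and the `mdc/<model>` key layout are hypothetical and are not taken from the PR; the real interface is likely async and richer than this.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical storage interface. A NATS, etcd or redis backend
/// would implement the same two methods.
pub trait KeyValueStore {
    /// Insert or overwrite the value stored under `key`.
    fn put(&self, key: &str, value: Vec<u8>);
    /// Return a copy of the value stored under `key`, if any.
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// In-memory backend, useful for tests and single-process runs.
#[derive(Default)]
pub struct MemoryStore {
    entries: Mutex<HashMap<String, Vec<u8>>>,
}

impl KeyValueStore for MemoryStore {
    fn put(&self, key: &str, value: Vec<u8>) {
        self.entries.lock().unwrap().insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.entries.lock().unwrap().get(key).cloned()
    }
}

/// Publish a serialized card under an assumed `mdc/<model>` key and
/// return the key that was written.
pub fn publish_card(store: &dyn KeyValueStore, model: &str, card_json: &str) -> String {
    let key = format!("mdc/{model}");
    store.put(&key, card_json.as_bytes().to_vec());
    key
}
```

Because callers only see `&dyn KeyValueStore`, swapping the memory backend for a networked one would not change the publishing code.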
      
      Fetching the MDC from the store, doing pre-processing on the ingress side, and publishing a card backed by a GGUF are all left for a later commit.
      
      Part of #743 
  24. 09 Apr, 2025 1 commit
  25. 31 Mar, 2025 1 commit
  26. 19 Mar, 2025 1 commit
    • chore: Don't depend on openssl (#292) · 7c3fd5c9
      Graham King authored
      This makes all the Rust parts use the ring / rustls libraries instead of a locally installed openssl. It's a step on the journey to being statically linked.
      
      Pieces:
      - `tokenizers` and `mistralrs` now support rustls (mistralrs by default, tokenizers with a feature flag).
      - Move shared dependencies up into the workspace.
      - The new version of the `rand` crate renames some items for compatibility with future Rust.
      - Use cargo deny to ensure the openssl dependency doesn't creep back in.
  27. 14 Mar, 2025 1 commit
  28. 13 Mar, 2025 1 commit
  29. 11 Mar, 2025 1 commit
  30. 10 Mar, 2025 1 commit
  31. 08 Mar, 2025 1 commit
  32. 05 Mar, 2025 1 commit
  33. 28 Feb, 2025 1 commit
  34. 27 Feb, 2025 1 commit
  35. 26 Feb, 2025 2 commits
  36. 25 Feb, 2025 2 commits
    • feat: sglang backend for tio (#271) · e97493eb
      Graham King authored
      - Set up venv
      
      ```
      uv venv
      source .venv/bin/activate
      uv pip install pip
      uv pip install sgl-kernel --force-reinstall --no-deps
      uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
      ```
      
      - Build: `cargo build --release --features sglang`
      
      - Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
      
      - Run Deepseek multi-gpu / multi-node:
      
      Node 1:
      ```
      tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
      ```
      
      Node 2:
      ```
      tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
      ```
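
The two commands above differ only in the input mode and the node rank. A minimal sketch of generating the argument list per node follows; the helper `node_args` is hypothetical, and it assumes (generalizing from the two-node example) that rank 0 serves HTTP with `in=http` while every other rank uses `in=none`.

```rust
/// Build the `tio` argument list for one node of a multi-node sglang run.
/// Assumption: only rank 0 accepts HTTP traffic; other ranks take no input.
pub fn node_args(
    model_path: &str,
    tensor_parallel_size: u32,
    num_nodes: u32,
    node_rank: u32,
    dist_init_addr: &str,
) -> Vec<String> {
    let input = if node_rank == 0 { "in=http" } else { "in=none" };
    vec![
        input.to_string(),
        "out=sglang".to_string(),
        "--model-path".to_string(),
        model_path.to_string(),
        "--tensor-parallel-size".to_string(),
        tensor_parallel_size.to_string(),
        "--num-nodes".to_string(),
        num_nodes.to_string(),
        "--node-rank".to_string(),
        node_rank.to_string(),
        "--dist-init-addr".to_string(),
        dist_init_addr.to_string(),
    ]
}
```

Joining the returned vector with spaces reproduces the flags shown for Node 1 and Node 2 when called with ranks 0 and 1.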
    • Alec's avatar
      b760c569