  1. 07 Apr, 2025 1 commit
    • feat(dynamo-run): Basic routing choice (#524) · ec2e7307
      Graham King authored
      As a first step towards KV routing:
      - Introduce a `--router-mode` option in dynamo-run. It only supports random and round-robin right now, so it is not that interesting yet.
      - Make the vllm engine publish the KV events received from our patched vllm.
      
      Now we "just" need to connect the two. Easy, right?
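
For illustration only, the two routing choices named above can be sketched as follows. This is a minimal sketch, not the dynamo-run implementation: the `Router` class, the `RouterMode` enum, its string values, and the worker names are all hypothetical, and the commit does not say which exact values `--router-mode` accepts.

```python
import itertools
import random
from enum import Enum


class RouterMode(Enum):
    # Hypothetical names for the two modes the commit message mentions.
    RANDOM = "random"
    ROUND_ROBIN = "round-robin"


class Router:
    """Hypothetical router that picks one worker per request."""

    def __init__(self, workers, mode, seed=None):
        if not workers:
            raise ValueError("need at least one worker")
        self.workers = list(workers)
        self.mode = mode
        # A private RNG so a seed can make the random mode reproducible.
        self._rng = random.Random(seed)
        # An endless iterator over the workers, in order, for round-robin.
        self._cycle = itertools.cycle(self.workers)

    def pick(self):
        if self.mode is RouterMode.RANDOM:
            return self._rng.choice(self.workers)
        return next(self._cycle)


# Placeholder worker names; real workers would be discovered at runtime.
workers = ["worker-a", "worker-b", "worker-c"]

rr = Router(workers, RouterMode.ROUND_ROBIN)
rr_picks = [rr.pick() for _ in range(4)]

rnd = Router(workers, RouterMode.RANDOM, seed=0)
rnd_pick = rnd.pick()
```

Neither mode looks at worker state, which is why the commit calls them a first step: a KV-aware router would need to choose based on the published KV events instead.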
  2. 14 Mar, 2025 1 commit
  3. 08 Mar, 2025 1 commit
  4. 07 Mar, 2025 1 commit
  5. 05 Mar, 2025 1 commit
  6. 27 Feb, 2025 2 commits
  7. 25 Feb, 2025 2 commits
  8. 21 Feb, 2025 1 commit
    • feat(tio): Distributed inference! (#235) · 32a748e4
      Graham King authored
      Add support in tio for distributed components and discovery.
      
      Node 1:
      ```
      tio in=http out=tdr://ns/backend/mistralrs
      ```
      
      Node 2:
      ```
      tio in=tdr://ns/backend/mistralrs out=mistralrs ~/llm_models/Llama-3.2-3B-Instruct
      ```
      
      This will use etcd to auto-discover the model and NATS to talk to it. You can run multiple workers on the same endpoint and it will pick one at random each time.
      
      The `ns/backend/mistralrs` parts are purely symbolic. Pick anything, as long as it has three parts and matches the other node.
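
As a rough sketch of the addressing rule described above (three parts, matching on both nodes, one worker picked at random), the following may help. It is not how tio parses its arguments: `parse_endpoint`, `pick_worker`, the field names `namespace`/`component`/`name`, and the in-memory `registry` dict standing in for etcd are all assumptions made for illustration.

```python
import random
from typing import NamedTuple


class Endpoint(NamedTuple):
    # Hypothetical labels; the commit only says the path has three parts.
    namespace: str
    component: str
    name: str


def parse_endpoint(url: str) -> Endpoint:
    """Split a `tdr://a/b/c` URL into its three parts."""
    scheme = "tdr://"
    if not url.startswith(scheme):
        raise ValueError(f"expected a {scheme} URL, got {url!r}")
    parts = url[len(scheme):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected exactly three non-empty parts in {url!r}")
    return Endpoint(*parts)


def pick_worker(registry, endpoint, rng=random):
    """Pick one registered worker for the endpoint at random."""
    candidates = registry.get(endpoint, [])
    if not candidates:
        raise LookupError(f"no workers registered for {endpoint}")
    return rng.choice(candidates)


# In-memory stand-in for discovery; the worker IDs are placeholders.
endpoint = parse_endpoint("tdr://ns/backend/mistralrs")
registry = {endpoint: ["worker-1", "worker-2"]}
chosen = pick_worker(registry, endpoint)
```

Because the parsed endpoint is the lookup key, both nodes must use the same three parts, which matches the note that the names are symbolic but have to agree.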
  9. 18 Feb, 2025 1 commit
  10. 12 Feb, 2025 1 commit
  11. 05 Feb, 2025 1 commit
  12. 04 Feb, 2025 1 commit