- 07 Mar, 2025 1 commit
-
-
Graham King authored
There are two etcd keys: - The service - The model The second one is the interesting one for us. Previously we confused the two.
-
- 05 Mar, 2025 1 commit
-
-
Neelay Shah authored
Co-authored-by:Graham King <grahamk@nvidia.com>
-
- 27 Feb, 2025 2 commits
-
-
Paul Hendricks authored
-
Paul Hendricks authored
-
- 25 Feb, 2025 2 commits
-
-
Alec authored
Co-authored-by:aflowers <aflowers@nvidia.com>
-
Neelay Shah authored
Signed-off-by:
Neelay Shah <neelays@nvidia.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com>
-
- 21 Feb, 2025 1 commit
-
-
Graham King authored
Add support in tio for distributed components and discovery. Node 1: ``` tio in=http out=tdr://ns/backend/mistralrs ``` Node 2: ``` tio in=tdr://ns/backend/mistralrs out=mistralrs ~/llm_models/Llama-3.2-3B-Instruct ``` This will use etcd to auto-discover the model and NATS to talk to it. You can run multiple workers on the same endpoint and it will pick one at random each time. The `ns/backend/mistralrs` are purely symbolic, pick anything as long as it has three parts, and it matches the other node.
-
- 18 Feb, 2025 1 commit
-
-
Ryan Olson authored
Co-authored-by:Ryan McCormick <rmccormick@nvidia.com>
-
- 12 Feb, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by:
Ryan Olson <ryanolson@users.noreply.github.com> Co-authored-by:
Ryan McCormick <rmccormick@nvidia.com>
-
- 05 Feb, 2025 1 commit
-
-
J Wyman authored
-
- 04 Feb, 2025 1 commit
-
-
Ryan Olson authored
the journey begins
-