- 03 Mar, 2025 3 commits
-
-
Biswa Ranjan Panda authored
feat: Add compound AI python SDK
See merge request dl/triton/triton-distributed!5
-
Biswa Ranjan Panda authored
-
Graham King authored
`cargo build --locked` won't let you use "1.85.0" if you only have "stable" installed, even if those are the same thing right now.
-
- 02 Mar, 2025 4 commits
-
-
Piotr Marcinkiewicz authored
-
Ryan McCormick authored
-
Neelay Shah authored
Signed-off-by: Neelay Shah <neelays@nvidia.com>
-
Alec authored
-
- 01 Mar, 2025 1 commit
-
-
Piotr Marcinkiewicz authored
-
- 28 Feb, 2025 8 commits
-
-
Paul Hendricks authored
-
Graham King authored
Engine, `tio` support and docs. Proof of concept / experimental.
-
Alec authored
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
Ryan McCormick authored
-
Graham King authored
triton-distributed-llm component and support in tio
-
Harrison Saturley-Hall authored
Signed-off-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
Signed-off-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
-
Piotr Marcinkiewicz authored
-
NVShreyas authored
-
- 27 Feb, 2025 12 commits
-
-
Graham King authored
Docs in README
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Ryan Olson authored
-
ptarasiewiczNV authored
-
ptarasiewiczNV authored
Co-authored-by: Piotr Tarasiewicz Nvidia <ptarasiewicznv@Piotrs-MacBook-Pro.local>
Co-authored-by: nnshah1 <neelays@nvidia.com>
Co-authored-by: alec-flowers <aflowers@nvidia.com>
-
Anant Sharma authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Paul Hendricks authored
-
Tanmay Verma authored
Co-authored-by: NVShreyas <158103197+NVShreyas@users.noreply.github.com>
-
Sean SH Choi authored
Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
-
- 26 Feb, 2025 9 commits
-
-
Ryan McCormick authored
-
Paul Hendricks authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
Ryan McCormick authored
-
Ryan McCormick authored
Co-authored-by: Ryan Olson <rolson@nvidia.com>
-
Piotr Marcinkiewicz authored
-
Graham King authored
This means we don't need to explain the parts to the users until they are ready. We use what they provide and default the rest.

Allows all of this and more:
- `tio out=tdr://test`
- `tio out=tdr://llama_8b_pool`
- `tio in=tdr://corp_ai_research_group/model_next-20250226`
- `tio out=tdr://AIRE.NIM.migrate.mistralrs.1802`

Python, API, etc all untouched.
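
The "use what they provide and default the rest" behaviour could be sketched roughly as below. This is an illustration only, not the `tio` implementation: the three-part split into namespace, component, and endpoint name, the rule for which part a short path fills, the `_` join for extra segments, and every default value are assumptions that the commit message does not confirm. `parse_tdr` and `Endpoint` are hypothetical names.

```python
import re
from typing import NamedTuple

SCHEME = "tdr://"

# Hypothetical defaults: the commit message does not state the real values.
DEFAULT_NAMESPACE = "default-namespace"
DEFAULT_COMPONENT = "default-component"
DEFAULT_ENDPOINT = "default-endpoint"


class Endpoint(NamedTuple):
    namespace: str
    component: str
    name: str


def parse_tdr(url: str) -> Endpoint:
    """Split a tdr:// URL into three parts, defaulting whatever is missing."""
    if not url.startswith(SCHEME):
        raise ValueError(f"expected a {SCHEME} URL, got {url!r}")
    # Accept both '/' and '.' as separators, since the examples use both.
    parts = [p for p in re.split(r"[/.]", url[len(SCHEME):]) if p]
    namespace, component, name = DEFAULT_NAMESPACE, DEFAULT_COMPONENT, DEFAULT_ENDPOINT
    if len(parts) == 1:
        # Assumption: a single segment names the component.
        component = parts[0]
    elif len(parts) == 2:
        namespace, component = parts
    elif len(parts) >= 3:
        # Assumption: any extra segments are folded into the endpoint name.
        namespace, component = parts[0], parts[1]
        name = "_".join(parts[2:])
    return Endpoint(namespace, component, name)


for url in (
    "tdr://test",
    "tdr://corp_ai_research_group/model_next-20250226",
    "tdr://AIRE.NIM.migrate.mistralrs.1802",
):
    print(url, "->", parse_tdr(url))
```

Under these assumptions a user can start with a single name and add namespace or endpoint segments later without changing how existing URLs resolve.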
-
Anant Sharma authored
-
Piotr Marcinkiewicz authored
Signed-off-by: Piotr Marcinkiewicz <piotrm@nvidia.com>
-
Alec authored
-
- 25 Feb, 2025 3 commits
-
-
Graham King authored
- Setup venv
```
uv venv
source .venv/bin/activate
uv pip install pip
uv pip install sgl-kernel --force-reinstall --no-deps
uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
```
- Build: `cargo build --release --features sglang`
- Run single node (make sure you're in the venv): `./tio out=sglang ~/llm_models/my_model`
- Run Deepseek multi-gpu / multi-node:

Node 1:
```
tio in=http out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 0 --dist-init-addr 10.217.98.122:9876
```
Node 2:
```
tio in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llama-70B/ --tensor-parallel-size 8 --num-nodes 2 --node-rank 1 --dist-init-addr 10.217.98.122:9876
```
-
Neelay Shah authored
-
Alec authored
Co-authored-by: aflowers <aflowers@nvidia.com>
-