- 07 May, 2025 3 commits
-
-
Graham King authored
Signed-off-by: Graham King <graham@gkgk.org>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
祝健聪 authored
Signed-off-by: Chasing1020 <chasing1020@gmail.com>
-
Graham King authored
vllm and sglang are now the sub-process engines from #954. Also updated the docs on running vllm and sglang multi-GPU (tensor parallel) and multi-node (pipeline parallel).
-
- 06 May, 2025 3 commits
-
-
Hongkuan Zhou authored
-
Graham King authored
New vllm and sglang engines that run in a sub-process. Will hopefully replace the existing embedded python engines.
Why?
- Pure Python, does not require knowing Rust to work on it. Much simpler to maintain.
- No embedded Python interpreter which avoids linking libpython and avoids the MacOS virtualenv issues.
- Should have better performance as it's "native" vllm / sglang.
- Works with any version of vllm (including v1!) and sglang. Less upgrade struggle.
-
hhzhang16 authored
-
- 05 May, 2025 1 commit
-
-
julienmancuso authored
-
- 29 Apr, 2025 2 commits
-
-
mohammedabdulwahhab authored
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
-
Hongkuan Zhou authored
-
- 28 Apr, 2025 3 commits
-
-
Zhongdongming Dai authored
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
-
ishandhanani authored
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 26 Apr, 2025 2 commits
-
-
mohammedabdulwahhab authored
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
Co-authored-by: ishandhanani <ishandhanani@gmail.com>
Co-authored-by: Ubuntu <ubuntu@dev-inst-2w1vokvyuts83rzn4n1k7mnzew9.us-central1-a.c.brevdevprod.internal>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: Anant Sharma <anants@nvidia.com>
-
- 25 Apr, 2025 2 commits
-
-
julienmancuso authored
-
mohammedabdulwahhab authored
-
- 24 Apr, 2025 1 commit
-
-
ishandhanani authored
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
-
- 23 Apr, 2025 2 commits
-
-
Ryan McCormick authored
-
julienmancuso authored
-
- 22 Apr, 2025 1 commit
-
-
julienmancuso authored
-
- 21 Apr, 2025 1 commit
-
-
Graham King authored
"echo_core" is an engine that echoes the post-processed request back to you so you can see the template. Good for testing. It needed an extra flag set to work correctly.
-
- 18 Apr, 2025 4 commits
-
-
Graham King authored
-
Graham King authored
It's different enough that I made a new engine `vllm0_8` and renamed the previous engine to `vllm0_7`. `dynamo-run out=vllm` now expects vllm 0.8. This matches the container change in #690. For the older version, use `dynamo-run out=vllm0_7`.
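The engine naming described above can be sketched as a small helper. This is only an illustration of the mapping as of this commit: the function name `engine_for_vllm` is hypothetical and not part of dynamo, and the assumption that every version below 0.8 maps to `vllm0_7` is not confirmed by the commit message.

```python
def engine_for_vllm(version: str) -> str:
    """Pick the `out=` engine name for an installed vllm version string.

    Hypothetical helper, not part of dynamo. Assumes that anything
    older than 0.8 should use the renamed `vllm0_7` engine.
    """
    # Only the major and minor components matter for this choice.
    major, minor = (int(part) for part in version.split(".")[:2])
    return "vllm" if (major, minor) >= (0, 8) else "vllm0_7"


if __name__ == "__main__":
    for installed in ("0.7.2", "0.8.3"):
        print(installed, "->", f"dynamo-run out={engine_for_vllm(installed)}")
```

Comparing `(major, minor)` as a tuple of integers avoids the string-comparison trap where `"0.10"` would sort before `"0.8"`.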
-
mohammedabdulwahhab authored
-
mohammedabdulwahhab authored
-
- 15 Apr, 2025 3 commits
-
-
hhzhang16 authored
-
hhzhang16 authored
Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
-
Maksim Khadkevich authored
Signed-off-by: Maksim Khadkevich <mkhadkevich@nvidia.com>
-
- 11 Apr, 2025 3 commits
-
-
mohammedabdulwahhab authored
-
Tanmay Verma authored
Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
Signed-off-by: Tanmay Verma <tanmayv@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
-
hhzhang16 authored
Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
Signed-off-by: Pavithra Vijayakrishnan <160681768+pvijayakrish@users.noreply.github.com>
Signed-off-by: Chantal D Gama Rose <cdgamarose@nvidia.com>
Signed-off-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Co-authored-by: Julien Mancuso <jmancuso@nvidia.com>
Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Co-authored-by: mabdulwahhab <mabdulwahhab@nvidia.com>
Co-authored-by: Tushar Sharma <tusharma@nvidia.com>
Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Pavithra Vijayakrishnan <160681768+pvijayakrish@users.noreply.github.com>
Co-authored-by: cdgamarose-nv <cdgamarose@nvidia.com>
Co-authored-by: Anant Sharma <anants@nvidia.com>
Co-authored-by: julienmancuso <161955438+julienmancuso@users.noreply.github.com>
Co-authored-by: Suman Tatiraju <167138127+statiraju@users.noreply.github.com>
Co-authored-by: Piotr Marcinkiewicz <piotrm@nvidia.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
Co-authored-by: Tanmay Verma <tanmayv@nvidia.com>
-
- 09 Apr, 2025 3 commits
-
-
jon-chuang authored
feat: Extract Common Configs + Log Configs on Init + Add `test_` to `sdk/tests` filenames required for pytest (#434)
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
-
Tanmay Verma authored
-
cdgamarose-nv authored
#### Overview:
Updated the dynamo run doc `docs/guides/dynamo_run.md`
#### Details:
- Updated the instructions to make it clear which binary to use for built backends
- Reformatted the doc to make it more readable
- Added missing cmake library for ubuntu

Signed-off-by: Chantal D Gama Rose <cdgamarose@nvidia.com>
-
- 08 Apr, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 07 Apr, 2025 1 commit
-
-
tlipoca9 authored
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
-
- 03 Apr, 2025 1 commit
-
-
Graham King authored
-
- 25 Mar, 2025 1 commit
-
-
Graham King authored
Put the arguments in a JSON file:
```
{
    "dtype": "half",
    "trust_remote_code": true
}
```
Pass it like this:
```
dynamo-run out=sglang ~/llm_models/Llama-3.2-3B-Instruct --extra-engine-args sglang_extra.json
```
Requested here https://github.com/ai-dynamo/dynamo/issues/290 (`dtype`) and here https://github.com/ai-dynamo/dynamo/issues/360 (`trust_remote_code`).
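If the engine arguments are generated by a script rather than written by hand, the same two steps might look like the sketch below. It writes the JSON file and builds the argument list for the command shown above; it does not launch anything. The helper name `build_sglang_command` is hypothetical, and only the `out=sglang` and `--extra-engine-args` arguments come from the commit message.

```python
import json
import os
import tempfile
from pathlib import Path


def build_sglang_command(model_path: str, engine_args: dict, args_file: Path) -> list[str]:
    """Write engine_args to args_file and return the dynamo-run argv.

    Hypothetical helper: it only mirrors the command line from the
    commit message and does not validate the engine arguments.
    """
    # json.dumps emits lowercase `true`/`false`, which is what a JSON file needs.
    args_file.write_text(json.dumps(engine_args, indent=4))
    return [
        "dynamo-run",
        "out=sglang",
        # Expand "~" ourselves because no shell is involved when passing argv.
        os.path.expanduser(model_path),
        "--extra-engine-args",
        str(args_file),
    ]


if __name__ == "__main__":
    args_path = Path(tempfile.mkdtemp()) / "sglang_extra.json"
    command = build_sglang_command(
        "~/llm_models/Llama-3.2-3B-Instruct",
        {"dtype": "half", "trust_remote_code": True},
        args_path,
    )
    print(" ".join(command))
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting issues with paths that contain spaces.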
-
- 24 Mar, 2025 1 commit
-
-
Graham King authored
This lets us do:
```
dynamo-run out=llamacpp <gguf_file>
```
Previously a `--model-config <hf-repo>` was also required, to configure our tokenizer.
-
- 21 Mar, 2025 1 commit
-
-
Olga Andreeva authored
Co-authored-by: Olga Andreeva <oandreeva@oandreeva-mlt.client.nvidia.com>
-