- 31 Oct, 2025 1 commit
-
-
milesial authored
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
-
- 29 Oct, 2025 1 commit
-
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 27 Oct, 2025 1 commit
-
-
milesial authored
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
-
- 14 Oct, 2025 1 commit
-
-
zhongdaor-nv authored
Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
-
- 07 Oct, 2025 1 commit
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
- 01 Oct, 2025 2 commits
-
-
Jacky authored
Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 30 Sep, 2025 1 commit
-
-
ryan-lempka authored
Signed-off-by: Ryan Lempka <rlempka@nvidia.com>
-
- 23 Sep, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 18 Sep, 2025 2 commits
-
-
zhongdaor-nv authored
feat: enhance GPT OSS frontend with improved harmony tool calling parser and reasoning parser (#2999)
Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
-
- 17 Sep, 2025 1 commit
-
-
Ayush Agarwal authored
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: Graham King <grahamk@nvidia.com>
Co-authored-by: Graham King <grahamk@nvidia.com>
-
- 16 Sep, 2025 1 commit
-
-
Ryan Olson authored
Signed-off-by: Ryan Olson <rolson@nvidia.com>
-
- 15 Sep, 2025 1 commit
-
-
Elyas Mehtabuddin authored
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: ayushag <ayushag@nvidia.com>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
-
- 05 Sep, 2025 1 commit
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
- 03 Sep, 2025 1 commit
-
-
Olga Andreeva authored
refactor: Split ModelType to ModelInput for request and response type; ModelType for the supported workloads (#2714)
Signed-off-by: Guan Luo <gluo@nvidia.com>
Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
Co-authored-by: Guan Luo <gluo@nvidia.com>
Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
-
- 02 Sep, 2025 1 commit
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
- 28 Aug, 2025 1 commit
-
-
atchernych authored
-
- 26 Aug, 2025 1 commit
-
-
Chi McIsaac authored
-
- 25 Aug, 2025 1 commit
-
-
nachiketb-nvidia authored
-
- 22 Aug, 2025 2 commits
-
-
Graham King authored
-
Ayush Agarwal authored
-
- 19 Aug, 2025 2 commits
-
-
nachiketb-nvidia authored
Co-authored-by: Graham King <grahamk@nvidia.com>
-
atchernych authored
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
-
- 14 Aug, 2025 1 commit
-
-
Greg Clark authored
Signed-off-by: Greg Clark <grclark@nvidia.com>
-
- 11 Aug, 2025 1 commit
-
-
Graham King authored
-
- 07 Aug, 2025 1 commit
-
-
Graham King authored
-
- 23 Jul, 2025 1 commit
-
-
Biswa Panda authored
-
- 10 Jul, 2025 1 commit
-
-
Graham King authored
-
- 03 Jul, 2025 1 commit
-
-
Tom O'Brien authored
-
- 27 Jun, 2025 1 commit
-
-
Muthuraj Ramalingakumar authored
-
- 26 Jun, 2025 1 commit
-
-
Paul Hendricks authored
-
- 25 Jun, 2025 1 commit
-
-
ishandhanani authored
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 24 Jun, 2025 1 commit
-
-
Paul Hendricks authored
-
- 11 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
-
- 04 Jun, 2025 1 commit
-
-
Paul Hendricks authored
-
- 03 Jun, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: jothomson <jwillthomson19@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
-
- 02 Jun, 2025 1 commit
-
-
Graham King authored
It was confusing to have two names for one type. This tidy-up, started in #1064, is now complete.
-
- 29 May, 2025 1 commit
-
-
Hongkuan Zhou authored
Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
-
- 25 Apr, 2025 1 commit
-
-
Graham King authored
This will allow an ingress-side pre-processor to see it without needing a model checkout.

Currently pre-processing is done in the worker, which has local access to the model deployment card ("MDC") files (`config.json`, `tokenizer.json` and `tokenizer_config.json`). We want to move the pre-processor to the ingress side to support KV routing. That requires the ingress side (i.e. the HTTP server), which runs on a different machine than the worker, to be able to see those three files.

To support that, this PR makes the worker upload the contents of those files to the NATS object store and publish the MDC, with those NATS URLs, to the key-value store. The key-value store has an interface so that any store (NATS, etcd, Redis, etc.) can be supported. Implementations for memory and NATS are provided.

Fetching the MDC from the store, doing pre-processing on the ingress side, and publishing a card backed by a GGUF are all left for a later commit.

Part of #743
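
The store abstraction described above can be illustrated with a rough sketch. This is not the project's code: `KeyValueStore`, `MemoryStore`, `publish_card`, the `mdc/<model>` key layout, and the `nats://example-bucket/...` URLs are all hypothetical names chosen for illustration, and no real NATS client is used. It only shows the shape of the idea: a small interface, an in-memory implementation, and a card that points at the three uploaded files.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, Optional

# The three MDC files named in the commit message.
MDC_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json")


class KeyValueStore(ABC):
    """Hypothetical interface; a NATS, etcd, or Redis backend would implement it."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        ...


class MemoryStore(KeyValueStore):
    """In-memory implementation, useful for tests and single-process runs."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def publish_card(store: KeyValueStore, model_name: str, file_urls: Dict[str, str]) -> str:
    """Publish a card that references the uploaded files by URL.

    The key layout ("mdc/<model>") is an assumption made for this sketch.
    """
    missing = [name for name in MDC_FILES if name not in file_urls]
    if missing:
        raise ValueError(f"missing URLs for: {', '.join(missing)}")
    card = {"model": model_name, "files": {name: file_urls[name] for name in MDC_FILES}}
    key = f"mdc/{model_name}"
    store.put(key, json.dumps(card).encode("utf-8"))
    return key
```

Because `publish_card` depends only on the interface, swapping `MemoryStore` for another backend would not change the caller, which is the point of keeping the store behind an interface.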
-