- 30 Sep, 2025 1 commit
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
- 17 Sep, 2025 1 commit
-
-
Graham King authored
Signed-off-by: Graham King <grahamk@nvidia.com>
-
- 03 Sep, 2025 1 commit
-
-
Olga Andreeva authored
refactor: Split ModelType to ModelInput for request and response type; ModelType for the supported workloads (#2714)
Signed-off-by: Guan Luo <gluo@nvidia.com>
Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
Co-authored-by: Guan Luo <gluo@nvidia.com>
Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
-
- 22 Aug, 2025 1 commit
-
-
Graham King authored
-
- 14 Aug, 2025 1 commit
-
-
Jorge António authored
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
-
- 11 Jun, 2025 1 commit
-
-
Ryan Olson authored
-
- 21 May, 2025 2 commits
-
-
Graham King authored
-
Graham King authored
- Stop advertising a model when its last instance stops. Previously this happened when any instance stopped. - Faster locks on model manager. - Move discovery code out of http, as it is used by all inputs.
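The "last instance" behaviour described in this commit message can be illustrated with a minimal sketch. It is not the project's actual code: the `ModelRegistry` class and its method names are hypothetical, and Python is used only for brevity. The idea is to track the set of live instances per model and withdraw the model only when that set becomes empty.

```python
from collections import defaultdict


class ModelRegistry:
    """Hypothetical registry tracking which instances serve each model."""

    def __init__(self):
        # Maps model name -> set of live instance ids.
        self._instances = defaultdict(set)

    def add_instance(self, model, instance_id):
        self._instances[model].add(instance_id)

    def remove_instance(self, model, instance_id):
        """Remove one instance; return True if the model stops being advertised."""
        instances = self._instances.get(model)
        if instances is None:
            return False
        instances.discard(instance_id)
        if not instances:
            # The last instance is gone, so withdraw the model.
            del self._instances[model]
            return True
        return False

    def advertised(self):
        return sorted(self._instances)
```

With two instances registered for one model, removing the first leaves the model advertised, and only removing the second withdraws it.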
-