chore: adjust gpu-memory-utilization to accommodate vLLM's runtime GPU memory requirement (#5755)
Signed-off-by: Guan Luo <gluo@nvidia.com>