fix(vllm): cap gpu_memory_utilization for encoder-only vision model load (#8466)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>