[ROCm] Improve error handling while loading quantized model on gfx120… (#31715)

Signed-off-by: brian033 <85883730+brian033@users.noreply.github.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>

Commit b89275d0, authored Jan 15, 2026 by brian033; committed by GitHub Jan 15, 2026

Showing with 5 additions and 1 deletion

vllm/model_executor/layers/quantization/quark/schemes/quark_ocp_mx.py +5 -1

--- a/vllm/model_executor/layers/quantization/quark/schemes/quark_ocp_mx.py
+++ b/vllm/model_executor/layers/quantization/quark/schemes/quark_ocp_mx.py
@@ -153,7 +153,11 @@ try:
        fake_impl=gemm_with_dynamic_quant_fake,
        dispatch_key=current_platform.dispatch_key,
    )
-except (ImportError, AttributeError):
+except (ImportError, AttributeError, RuntimeError):
+    logger.warning(
+        "AITER is not found or QuarkOCP_MX is not supported on the current "
+        "platform. QuarkOCP_MX quantization will not be available."
+    )
    dynamic_mxfp4_quant = gemm_afp4wfp4 = None
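
The guarded-load pattern in this change can be sketched in isolation. The following is a minimal, standalone illustration and not the vLLM code: `load_optional_kernels`, `register`, and `unsupported_platform_register` are hypothetical names, and the standard-library `math` module stands in for an optional backend such as AITER. It assumes that any of the three exception types may be raised while importing or registering the backend, in which case a warning is logged and every kernel handle falls back to `None`.

```python
import importlib
import logging

logger = logging.getLogger("optional_backend_sketch")


def load_optional_kernels(module_name, attr_names, register=None):
    """Return the named attributes of a module, or None for each on failure.

    `register` is a hypothetical hook standing in for a custom-op
    registration step that may raise RuntimeError on an unsupported platform.
    """
    try:
        module = importlib.import_module(module_name)  # may raise ImportError
        kernels = tuple(getattr(module, name) for name in attr_names)  # AttributeError
        if register is not None:
            register(*kernels)  # may raise RuntimeError
        return kernels
    except (ImportError, AttributeError, RuntimeError) as exc:
        # Log once at load time so the user sees why the feature is missing.
        logger.warning(
            "Optional backend %r is unavailable (%s: %s); "
            "dependent features will be disabled.",
            module_name, type(exc).__name__, exc,
        )
        return tuple(None for _ in attr_names)


def unsupported_platform_register(*_kernels):
    """Hypothetical registration step that fails on this platform."""
    raise RuntimeError("operator not supported on this platform")


# Successful load: both handles are real callables.
sqrt, floor = load_optional_kernels("math", ["sqrt", "floor"])

# Failed registration: handles fall back to None instead of crashing at import.
(disabled,) = load_optional_kernels(
    "math", ["sqrt"], register=unsupported_platform_register
)
```

With this shape, callers need to check a handle for `None` before using it, so the warning at load time is the only signal that the feature has been disabled.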