[Bugfix] Fix bnb 8bit model weights loading (#19917)

Signed-off-by: Isotr0py <2037008807@qq.com>

Commit 6f170f11 (unverified), authored Jun 21, 2025 by Isotr0py, committed by GitHub Jun 21, 2025

Showing with 2 additions and 2 deletions

vllm/model_executor/model_loader/bitsandbytes_loader.py +2 -2

--- a/vllm/model_executor/model_loader/bitsandbytes_loader.py
+++ b/vllm/model_executor/model_loader/bitsandbytes_loader.py
@@ -577,10 +577,10 @@ def dequantize_dq(quant_states: dict) -> None:
    thereby avoiding this computational overhead during inference. This comes 
    at the cost of increased memory usage.
    """
-    from bitsandbytes.functional import dequantize_blockwise
+    from bitsandbytes.functional import QuantState, dequantize_blockwise
    for _, quant_state in quant_states.items():
        # Copied from: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/0.45.3/bitsandbytes/functional.py#L1352-#L1356
-        if quant_state.nested:
+        if isinstance(quant_state, QuantState) and quant_state.nested:
            absmax = dequantize_blockwise(quant_state.absmax,
                                          quant_state.state2)
            absmax += quant_state.offset
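
The guard added in this hunk can be illustrated in isolation. The sketch below is not vLLM or bitsandbytes code: `FakeQuantState` and `dequantize_nested` are hypothetical stand-ins, and it assumes (the diff does not say so explicitly) that in the 8-bit path some values in `quant_states` are not `QuantState` instances and therefore have no `.nested` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeQuantState:
    """Hypothetical stand-in for bitsandbytes.functional.QuantState."""
    nested: bool
    dequantized: bool = False


def dequantize_nested(quant_states: dict) -> list:
    """Mark nested FakeQuantState entries as dequantized; return their keys."""
    touched = []
    for name, state in quant_states.items():
        # `and` short-circuits: .nested is only read when the isinstance
        # check has already passed, so other value types are skipped safely.
        if isinstance(state, FakeQuantState) and state.nested:
            state.dequantized = True  # placeholder for the real absmax work
            state.nested = False
            touched.append(name)
    return touched


states = {
    "layer.0": FakeQuantState(nested=True),
    "layer.1": FakeQuantState(nested=False),
    "layer.2": object(),  # assumed shape of a non-QuantState entry
}
touched = dequantize_nested(states)

# Without the isinstance guard, the same entry raises AttributeError.
try:
    _ = states["layer.2"].nested
    unguarded_failed = False
except AttributeError:
    unguarded_failed = True
```

Only `layer.0` is processed; the plain `object()` entry is skipped instead of raising, which is the behavior the one-line change in the diff appears to be after.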