[Bugfix] Fix dummy weight for fp8 (#4916)

Allow the dummy load format for fp8. torch.Tensor.uniform_ doesn't support FP8 at the moment.

Co-authored-by: Mor Zusman <morz@ai21.com>

Unverified Commit f0eecee6 authored May 20, 2024 by Mor Zusman Committed by GitHub May 20, 2024

Showing with 8 additions and 1 deletion

vllm/model_executor/model_loader/weight_utils.py +8 -1

--- a/vllm/model_executor/model_loader/weight_utils.py
+++ b/vllm/model_executor/model_loader/weight_utils.py
@@ -369,4 +369,11 @@ def initialize_dummy_weights(
    """
    for param in model.state_dict().values():
        if torch.is_floating_point(param):
-            param.data.uniform_(low, high)
+            if torch.finfo(param.data.dtype).bits < 16:
+                # uniform_ doesn't support < 16-bit datatypes (FP8)
+                dtype = param.data.dtype
+                tmp_param = param.data.to(torch.float16)
+                tmp_param = tmp_param.uniform_(low, high).to(dtype)
+                param.data.copy_(tmp_param)
+            else:
+                param.uniform_(low, high)