Unverified commit 6b04039a, authored by sstamenk, committed by GitHub

[BugFix] Skip the Q component for QKVParallelLinear in the case of QKVCrossParallelLinear since its width is 0 (#22369)
Signed-off-by: sstamenk <sstamenk@amd.com>
parent 1c859a13
@@ -121,6 +121,9 @@ def requantize_with_max_scale(
     if unfused_module_in_checkpoint:
         start = 0
         for idx, logical_width in enumerate(logical_widths):
+            # Skip any component with zero width.
+            if logical_width == 0:
+                continue
             end = start + logical_width
             weight_dq = per_tensor_dequantize(weight[start:end, :],
                                               weight_scale[idx])
...
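
The effect of the added guard on the slice bookkeeping can be illustrated with a minimal, torch-free sketch. This is not the vLLM implementation: `slice_bounds` is a hypothetical helper, the widths are made up, and the `start = end` update at the end of the loop is an assumption, since the diff elides the rest of the loop body. It only shows that skipping a zero-width component leaves `start` untouched while `idx` still advances, so each remaining slice stays paired with its own scale index.

```python
def slice_bounds(logical_widths):
    """Return (idx, start, end) for every non-empty component of a fused weight.

    Hypothetical helper for illustration; it mirrors only the loop structure
    visible in the diff, not the dequantize/requantize calls.
    """
    bounds = []
    start = 0
    for idx, logical_width in enumerate(logical_widths):
        # Skip any component with zero width (e.g. an empty Q in the
        # cross-attention case described in the commit message).
        if logical_width == 0:
            continue
        end = start + logical_width
        bounds.append((idx, start, end))
        # Assumed: the elided remainder of the loop advances `start` like this.
        start = end
    return bounds


# Made-up widths: Q is empty, K and V each span 4 rows.
print(slice_bounds([0, 4, 4]))  # [(1, 0, 4), (2, 4, 8)]
```

With these widths, index 0 is never visited for slicing, and K and V are still matched with scale indices 1 and 2.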