Commit 227e48a9 authored by Kane, committed by GitHub

fix muxi int8 (#535)



1. Fix the accuracy problem in muxi int8-vllm inference results. Allocating the output with torch.empty left NaN values in the inference output.
Co-authored-by: root <root@master.cluster.local>
parent 8b1cea0a
......@@ -625,7 +625,7 @@ class MMWeightWint8channelAint8channeldynamicVllm(MMWeightQuantTemplate):
shape = (input_tensor.shape[0], self.weight.shape[1])
dtype = input_tensor.dtype
device = input_tensor.device
-output_tensor = torch.empty(shape, dtype=dtype, device=device, requires_grad=False)
+output_tensor = torch.zeros(shape, dtype=dtype, device=device, requires_grad=False)
input_tensor_quant, input_tensor_scale = self.act_quant_func(input_tensor)
torch.ops._C.cutlass_scaled_mm(
......
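
The commit message attributes the NaN values to torch.empty, which returns a buffer whose contents are uninitialized, so any element the kernel does not overwrite keeps whatever was already in memory. The following is a toy pure-Python model of that failure mode, not the actual muxi or vllm kernel. The helper names (`alloc_empty`, `alloc_zeros`, `hypothetical_partial_kernel`) are made up for illustration, NaN stands in for arbitrary leftover memory, and the assumption that the kernel skips some output elements is ours: the commit does not say why stale values survived.

```python
import math


def alloc_empty(n):
    # Stand-in for torch.empty: real contents are unspecified.
    # NaN is used here to model the worst case of leftover memory.
    return [math.nan] * n


def alloc_zeros(n):
    # Stand-in for torch.zeros: every element starts at a known value.
    return [0.0] * n


def hypothetical_partial_kernel(out, values, written):
    # Hypothetical kernel that overwrites only the first `written`
    # elements of the output buffer and leaves the rest untouched.
    for i in range(written):
        out[i] = values[i]
    return out


values = [1.0, 2.0, 3.0, 4.0]
from_empty = hypothetical_partial_kernel(alloc_empty(4), values, written=3)
from_zeros = hypothetical_partial_kernel(alloc_zeros(4), values, written=3)

# The untouched element stays NaN in the "empty" buffer, 0.0 in the zeroed one.
has_nan_empty = any(math.isnan(x) for x in from_empty)
has_nan_zeros = any(math.isnan(x) for x in from_zeros)
```

Zero-initializing costs one extra fill of the output buffer per call, which is the trade-off this commit accepts in exchange for deterministic results.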