[Workaround] Use bf16 LDS to store fp32 input
The quantize_transpose_vector_blockwise function uses more than 64 KB of
LDS when the input type is fp32. The maximum LDS size on DCU is 64 KB, so
as a workaround the fp32 input is stored in LDS as bf16.
Signed-off-by: wenjh <wenjh@sugon.com>