"...git@developer.sourcefind.cn:kecinstone/2024-pra-vllm.git" did not exist on "0fbfc4b81b9208f13ceb82d1ea92ff14a6e56088"
[PyTorch] Set usages for linear op quantizers before forward (#2222)
* Make sure to set usages for linear op quantizers before forward

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Avoid unsupported case for fused dbias+quantize kernel

  Hopper does not support dbias + FP8 cast without FP8 transpose.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
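To illustrate the two changes, here is a minimal Python sketch assuming a Transformer Engine style `Quantizer` with a `set_usage(rowwise=..., columnwise=...)` method; every name and signature below (`Quantizer`, `configure_linear_quantizers`, `can_fuse_dbias_quantize`, the `"sm90"` check) is an illustrative assumption, not the actual code from this commit.

```python
# Hedged sketch of the two fixes described above; all names here are
# illustrative assumptions in the Transformer Engine style, not the
# committed code.
from typing import Optional


class Quantizer:
    """Tracks which tensor layouts downstream kernels will consume."""

    def __init__(self) -> None:
        self.rowwise_usage = True
        self.columnwise_usage = True

    def set_usage(
        self,
        *,
        rowwise: Optional[bool] = None,
        columnwise: Optional[bool] = None,
    ) -> None:
        # Recording usages before quantization lets the quantize step
        # skip work, e.g. not materializing an FP8 transpose that no
        # consumer will read.
        if rowwise is not None:
            self.rowwise_usage = rowwise
        if columnwise is not None:
            self.columnwise_usage = columnwise


def configure_linear_quantizers(
    input_q: Quantizer, weight_q: Quantizer, requires_grad: bool
) -> None:
    # First fix: set usages *before* forward. The forward GEMM consumes
    # row-wise data; column-wise (transposed) copies are only needed if
    # a backward pass will run.
    input_q.set_usage(rowwise=True, columnwise=requires_grad)
    weight_q.set_usage(rowwise=True, columnwise=requires_grad)


def can_fuse_dbias_quantize(arch: str, grad_q: Quantizer) -> bool:
    # Second fix: on Hopper (sm90), the fused dbias + FP8 cast kernel is
    # unsupported unless the FP8 transpose is also produced, so fall
    # back to an unfused dbias + quantize path in that case.
    if arch == "sm90" and not grad_q.columnwise_usage:
        return False
    return True
```

The key design point in both fixes is ordering: usages must be fixed before any quantization runs, so that later steps (kernel selection, fused dbias paths) can rely on knowing which layouts will actually be produced.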