Lower precision gated-act to accelerate FP8 current-scaling. (#2153)
* Applying the original precision for Norm outputs and activation computations.

  Signed-off-by: Ming Huang <mingh@nvidia.com>

* Adding knob to control norm output precision.

  Signed-off-by: Ming Huang <mingh@nvidia.com>

* Removing the knob and applying lower-precision norm with current-scaling only.

  Signed-off-by: Ming Huang <mingh@nvidia.com>

* Fix the error when quantizer==None

  Signed-off-by: Ming Huang <mingh@nvidia.com>

---------

Signed-off-by: Ming Huang <mingh@nvidia.com>