"...git@developer.sourcefind.cn:kecinstone/2024-pra-vllm.git" did not exist on "3e9f991d6acd7efd90f04f1f530b837a40c93442"
-
Tim Moon authored
* Coalesce NCCL all-gathers for MXFP8 all-gather Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Add missing import Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cache quantized input tensor after linear module forward pass Signed-off-by:
Tim Moon <tmoon@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix linter warnings Signed-off-by:
Tim Moon <tmoon@nvidia.com> * Avoid unnecessarily allocating layernorm output in LayerNormLinear/LayerNormMLP Signed-off-by:
Tim Moon <tmoon@nvidia.com> --------- Signed-off-by:
Tim Moon <tmoon@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
0356010c