"...git@developer.sourcefind.cn:kecinstone/2024-pra-vllm.git" did not exist on "e5452ddfd6e9a08d5e15bd81a010934550b9b507"
Hongbin Liu authored
* disable wgrad accumulation and reduce in backward() And manually launch it in backward_dw()

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* format

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* refactor

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* refactor

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* set skip_backward_post_hook to True only if delay_wgrad_compute is True

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

* format

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>

---------

Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>
Co-authored-by: Hongbin Liu <hongbinl@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
1258bbe0
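
The commit message describes moving weight-gradient (wgrad) accumulation out of backward() and into a separately launched backward_dw(), and skipping the post-backward hook only when delay_wgrad_compute is set. The pattern might look roughly like the toy sketch below. Everything except the names backward, backward_dw, delay_wgrad_compute, and skip_backward_post_hook is an assumption made for illustration: ToyLinear, main_grad, hook_calls, and _post_hook are hypothetical, the layer is a scalar y = w * x, and running the hook at the end of backward_dw() is a guess rather than something the commit states. This is not the code the commit actually changes.

```python
# Hypothetical sketch of delayed weight-gradient computation.
# ToyLinear, main_grad, hook_calls and _post_hook are illustrative names,
# not the API of the library this commit modifies.


class ToyLinear:
    """Scalar 'linear layer' y = w * x with optionally delayed wgrad."""

    def __init__(self, weight, delay_wgrad_compute=False):
        self.weight = weight
        self.main_grad = 0.0  # accumulated weight gradient
        self.delay_wgrad_compute = delay_wgrad_compute
        # Skip the post-backward hook only when wgrad is delayed.
        self.skip_backward_post_hook = delay_wgrad_compute
        self.hook_calls = 0
        self._saved_input = None
        self._pending = []  # (input, grad_output) pairs awaiting wgrad

    def forward(self, x):
        self._saved_input = x
        return self.weight * x

    def _post_hook(self):
        # Stand-in for gradient accumulation / reduction bookkeeping.
        self.hook_calls += 1

    def backward(self, grad_output):
        # The input gradient (dgrad) is always computed here.
        grad_input = self.weight * grad_output
        if self.delay_wgrad_compute:
            # Defer wgrad: only remember what is needed to compute it later.
            self._pending.append((self._saved_input, grad_output))
        else:
            self.main_grad += self._saved_input * grad_output
        if not self.skip_backward_post_hook:
            self._post_hook()
        return grad_input

    def backward_dw(self):
        # Manually launched wgrad computation and accumulation.
        for x, g in self._pending:
            self.main_grad += x * g
        self._pending.clear()
        if self.skip_backward_post_hook:
            self._post_hook()
```

With delay_wgrad_compute=True in this sketch, backward() returns the input gradient but leaves main_grad untouched and does not run the hook; both happen only once backward_dw() is called. With the default setting, backward() does everything and backward_dw() is a no-op.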