Add mixed use of cuDNN fprop and flash-attn v2 bprop (#349)

* Add support for cuDNN fprop and FAv2 bprop
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* minor fixes
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* skip activation recompute tests if FAv2
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* restrict the use of FAv2 bprop to H100 only
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* move use_FAv2_bwd check to init
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* remove skipifs for FAv2 in test numerics
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix typos and wording for deterministic checks
  Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* Remove variables related to FAv2 skipifs
  Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
  Signed-off-by: cyanguwa <8636796+cyanguwa@users.noreply.github.com>

---------

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Signed-off-by: cyanguwa <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>