Michael Goldfarb authored
Correct fused attention output after each step to reduce intermediate memory use.
a4cb1d17
Correct fused attention output after each step to reduce intermediate memory use.
Signed-off-by: Michael Goldfarb <mgoldfarb@nvidia.com>
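
The idea in the commit message can be illustrated with a minimal sketch. It is not the code from this commit. It assumes that a "step" means attending to one chunk of keys and values, and that the "correction" is the usual log-sum-exp rescaling of partial softmax results. All function names here (`attend_chunk`, `merge`, `chunked_attention`) are hypothetical. The point is that folding each step's output into a running result right away means only one output and one log-sum-exp need to be kept, rather than every step's partial output.

```python
import math


def attend_chunk(q, keys, values):
    """Attention of one query vector over one chunk of keys/values.

    Returns the normalized output for this chunk and the log-sum-exp
    of its scores, which is needed to merge it with other chunks.
    """
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total
           for d in range(dim)]
    return out, m + math.log(total)


def merge(out_a, lse_a, out_b, lse_b):
    """Combine two partial results by rescaling each with its share
    of the combined softmax normalizer."""
    m = max(lse_a, lse_b)
    lse = m + math.log(math.exp(lse_a - m) + math.exp(lse_b - m))
    wa = math.exp(lse_a - lse)
    wb = math.exp(lse_b - lse)
    return [wa * a + wb * b for a, b in zip(out_a, out_b)], lse


def chunked_attention(q, key_chunks, value_chunks):
    """Process chunks one step at a time, correcting the running output
    after each step instead of storing every partial output."""
    out, lse = None, None
    for keys, values in zip(key_chunks, value_chunks):
        step_out, step_lse = attend_chunk(q, keys, values)
        if out is None:
            out, lse = step_out, step_lse
        else:
            out, lse = merge(out, lse, step_out, step_lse)
    return out
```

Under these assumptions, the chunked result matches attention computed over all keys and values at once, up to floating-point rounding, while the memory held between steps stays constant in the number of steps.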