Daniel Stokes authored
* Add support for overlapping wgrad NCCL AG with dgrad GEMM

Signed-off-by: djns99 <40156487+djns99@users.noreply.github.com>

* Remove unused wait on memcpy API from UB

Signed-off-by: djns99 <40156487+djns99@users.noreply.github.com>

* Add better commenting to MXFP8 overlap

Signed-off-by: djns99 <40156487+djns99@users.noreply.github.com>

---------

Signed-off-by: djns99 <40156487+djns99@users.noreply.github.com>
Co-authored-by: dastokes <dastokes@dastokes-dvt-01.nvidia.com>
d90ced7c