-
xiaoxi-wangfj authored
* [PyTorch|common] Implement unpadding kernel for FP8 1. Add multi-tensor unpadding kernel 2. Replace split+cat with unpadding kernel in Fp8Padding and Fp8Unpadding 3. Add unpadding with padding unit tests Signed-off-by:
xiaoxi-wangfj <690912414@qq.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add license Signed-off-by:
Xin Yao <xiny@nvidia.com> * Update padding.cu Signed-off-by:
Xin Yao <xiny@nvidia.com> --------- Signed-off-by:
xiaoxi-wangfj <690912414@qq.com> Signed-off-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Xin Yao <xiny@nvidia.com>
23cf4ff9