Shijie authored (commit 71725099)

* use separate qkv
  Signed-off-by: jaywan <jaywan@nvidia.com>

  add support for GQA
  Signed-off-by: jaywan <jaywan@nvidia.com>

  minor changes
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  change rtol
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  fix reshape issue
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  add rmsnorm and rotary position embedding
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  update rmsnorm
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  refactor layernorm and rmsnorm
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  support swiglu
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  add fused rope
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  minor changes
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  add rope api to __init__
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  minor changes
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

  fix fp8 dtype issue
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

* simplify ut cases
  Signed-off-by: jaywan <jaywan@nvidia.com>

* Update transformer_engine/paddle/layer/attention.py
  Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
  Signed-off-by: Shijie <505749828@qq.com>

* fix name issue
  Signed-off-by: Shijie Wang <jaywan@nvidia.com>

---------

Signed-off-by: Shijie Wang <jaywan@nvidia.com>
Signed-off-by: jaywan <jaywan@nvidia.com>
Signed-off-by: Shijie <505749828@qq.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>