[JAX] Refine MHA API and add DPA API (#653)
* Refine MHA API
* Reuse func from Flax
* DPA draft
* qkv packed draft
* Fix test_layer with fused attn
* Add attn_bias_type and improve some code flows
* Move scale_factor from __call__ to init
* Enhance the docs
* Add DPA public API and tests
* Refine docs
* Refine docs
* Fix conflict
* Add qkv separate fused attn
* Apply BSHD_BSHD_BSHD format
* Remove debug log
* Add fused attention layer tests
* Add NVTE_FUSED_ATTN docs
* Fine-grained fused attn settings
* Remove the default value of num_attention_head and head_dim
* Add teardown for fused attn env
* Unify the Optional notation
* Fix Pre/Post scale bias comments
* Add no_mask tests
* Add checkpoint_name for fused attn
* Fix the fused attn batcher

Signed-off-by: Reese Wang <rewang@nvidia.com>