"...git@developer.sourcefind.cn:OpenDAS/TransformerEngine.git" did not exist on "06947e87b5511f8ad69ccd00286de9227f0fad24"
Change Flax MHA to DPA to remove the duplicated QKV projection step (#2429)
Signed-off-by: tdophung <tdophung@nvidia.com>