Change Flax MHA to DPA to remove the duplicated QKV projection step (#2429)
Signed-off-by: tdophung <tdophung@nvidia.com>