Change Flax MHA to DPA to remove the duplicated QKV projection step (#2429)
Signed-off-by: tdophung <tdophung@nvidia.com>