Remove the need for `einsum` in Albert's attention computation (#12394)
* debug albert einsum
* Fix matmul computation
* Let's use torch linear layer.
* Style.
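
The commit message does not show the code itself, so the following is only a minimal sketch of the idea in the title: a head-wise `einsum` projection of the attention output can be written as an ordinary `torch.nn.Linear` call on the head-merged tensor. The tensor sizes, the variable names (`dense`, `context`, `via_einsum`, `via_linear`), and the assumed `(batch, seq_len, num_heads, head_size)` layout are illustrative assumptions, not taken from the actual diff.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical small sizes, chosen only for illustration.
batch, seq_len, num_heads, head_size = 2, 5, 4, 8
hidden_size = num_heads * head_size

# Output projection of the attention block (assumed to be a plain linear layer).
dense = nn.Linear(hidden_size, hidden_size)

# Per-head attention output, assumed layout: (batch, seq_len, num_heads, head_size).
context = torch.randn(batch, seq_len, num_heads, head_size)

# einsum formulation: reinterpret the linear weight as
# (num_heads, head_size, hidden_size) and contract over heads and head_size.
w = dense.weight.t().reshape(num_heads, head_size, hidden_size)
via_einsum = torch.einsum("bfnd,ndh->bfh", context, w) + dense.bias

# Linear-layer formulation: merge the head dimensions, then apply the layer.
via_linear = dense(context.flatten(2))

# Both formulations should agree up to floating-point rounding.
is_equivalent = torch.allclose(via_einsum, via_linear, atol=1e-5)
```

Under these assumptions the two results match, so dropping `einsum` changes how the projection is expressed, not what it computes.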