Use scaled_dot_product_attention in Wav2vec2/HuBERT's SelfAttention (#3253)
Summary: Replace the attention computation with `torch.nn.functional.scaled_dot_product_attention` to improve running efficiency.

Pull Request resolved: https://github.com/pytorch/audio/pull/3253

Reviewed By: mthrok

Differential Revision: D44800353

Pulled By: nateanl

fbshipit-source-id: 41550d868c809099aadbe812b0ebe2c38121efb8
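Below is a minimal sketch (not the exact torchaudio implementation) of the kind of change this PR describes: a hand-rolled attention computation replaced by `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to fused (e.g. FlashAttention or memory-efficient) kernels. The helper names `manual_attention` and `sdpa_attention`, tensor shapes, and the equivalence check are illustrative assumptions, not code from the PR.

```python
# Sketch: replacing a manual attention computation with the fused SDPA op.
# Function names and shapes are illustrative, not from the torchaudio source.
import math
import torch
import torch.nn.functional as F


def manual_attention(q, k, v, attn_mask=None):
    # q, k, v: (batch, num_heads, seq_len, head_dim)
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if attn_mask is not None:
        scores = scores + attn_mask
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, v)


def sdpa_attention(q, k, v, attn_mask=None):
    # Same math, delegated to PyTorch's fused scaled-dot-product kernel
    # (available since PyTorch 2.0).
    return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)


if __name__ == "__main__":
    q = torch.randn(2, 12, 64, 64)
    k = torch.randn(2, 12, 64, 64)
    v = torch.randn(2, 12, 64, 64)
    # The two paths should agree up to floating-point tolerance.
    torch.testing.assert_close(
        manual_attention(q, k, v), sdpa_attention(q, k, v), atol=1e-5, rtol=1e-5
    )
```

Beyond brevity, the main benefit is that the fused op avoids materializing the full attention-weight matrix on supported backends, which reduces memory traffic and improves throughput for long sequences.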