[All Seq2Seq models + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071)
* Output cross-attention with decoder attention output
* Update src/transformers/modeling_bert.py
* add cross-attention for t5 and bart as well
* fix tests
* correct typo in docs
* address Sylvain's and Sam's comments
* correct typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
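To illustrate what returning cross-attention weights alongside the decoder's self-attention weights can look like, here is a minimal, library-independent sketch in plain Python. It is not the code from this commit: `attention` and `toy_decoder_layer` are hypothetical names, the layer has no learned projections, masking, or multiple heads, and only the `output_attentions` flag name is borrowed from the library.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of vectors.

    Returns (context, weights); weights[i][j] is how strongly
    query i attends to key j.
    """
    scale = math.sqrt(len(keys[0]))
    weights = [
        softmax([sum(q_d * k_d for q_d, k_d in zip(q, k)) / scale for k in keys])
        for q in queries
    ]
    dim = len(values[0])
    context = [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]
    return context, weights


def toy_decoder_layer(decoder_states, encoder_states, output_attentions=False):
    """Hypothetical decoder layer: self-attention, then cross-attention.

    Assumption for illustration only: no projections, masks, or residuals.
    """
    # Self-attention: decoder positions attend to each other.
    hidden, self_weights = attention(decoder_states, decoder_states, decoder_states)
    # Cross-attention: decoder positions attend to the encoder outputs.
    hidden, cross_weights = attention(hidden, encoder_states, encoder_states)

    outputs = (hidden,)
    if output_attentions:
        # Return both sets of weights, not only the self-attention ones.
        outputs += (self_weights, cross_weights)
    return outputs


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 source positions
decoder_states = [[0.5, 0.5], [1.0, -1.0]]             # 2 target positions
hidden, self_w, cross_w = toy_decoder_layer(
    decoder_states, encoder_states, output_attentions=True
)
```

In this sketch `cross_w` has one row per decoder position and one column per encoder position, while `self_w` is square over the decoder positions. Without the flag, the function returns only the hidden states.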