Deberta v2 code simplification (#15732)
* Removed spurious subtraction
* Fixed condition checking for attention type
* Fixed sew_d copy of DeBERTa v2 attention
* Removed unused `p2p` attention type from DebertaV2-class models
* Fixed docs style