"tests/models/mpnet/test_tokenization_mpnet.py" did not exist on "df2af6d8b8765b1ac2cda12d2ece09bf7240fba8"
Fix llama model sdpa attention forward function masking bug when output_attentions=True (#30652)
* Fix llama model forward function with output_attentions=True and a same-length encoded sequence

* Fix style

* Propagate fix to modeling_cohere, gemma, dbrx, and olmo (which copy the same sdpa masking logic from llama)

* Fix style

* Ignore the unnecessary sdpa mask converter when output_attentions=True

* Add tests checking that sdpa and eager outputs match when output_attentions=True

* Split if statements in two lines

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix formatting

* Add fix to new jetmoe model

* Add missing output_attentions argument to jetmoe mask creation

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
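The core of the fix: SDPA kernels cannot return attention weights, so when output_attentions=True the model falls back to an eager attention path that materializes the weights and therefore needs the explicit causal mask; the SDPA mask converter must not skip the mask in that case. Below is a minimal self-contained sketch of the pattern, not the actual transformers code: `attend` and its tensor shapes are hypothetical, and the equivalence check mirrors what the added tests verify (sdpa and eager outputs match).

```python
import torch
import torch.nn.functional as F

def attend(q, k, v, causal_mask, output_attentions=False):
    """Hypothetical sketch: route to eager attention when attention
    weights are requested, otherwise use SDPA. The bug being fixed is
    that the explicit mask was dropped even when output_attentions=True,
    starving the eager fallback of the mask it needs."""
    if output_attentions:
        # Eager path: materializes the attention weights, needs the mask.
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        scores = scores + causal_mask  # additive mask, -inf above the diagonal
        weights = scores.softmax(dim=-1)
        return weights @ v, weights
    # SDPA path: the kernel applies causality internally, so the
    # explicit mask may safely be ignored here for speed.
    return F.scaled_dot_product_attention(q, k, v, is_causal=True), None

# Check that the two paths agree on a same-length sequence,
# as the tests added in this commit do for the real models.
torch.manual_seed(0)
q = torch.randn(1, 4, 8, 16)  # (batch, heads, seq_len, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)
seq = q.shape[-2]
mask = torch.triu(torch.full((seq, seq), float("-inf")), diagonal=1)

sdpa_out, _ = attend(q, k, v, mask, output_attentions=False)
eager_out, weights = attend(q, k, v, mask, output_attentions=True)
assert torch.allclose(sdpa_out, eager_out, atol=1e-5)
assert weights.shape == (1, 4, seq, seq)
```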