Jayendra authored
* Added logic to return attention from the Flax BERT model and added test cases to check that (a short usage sketch follows this entry)
* Added new line at the end of file to test_modeling_flax_common.py
* Fixing code style
* Fixing Roberta and Electra models too, which copy from BERT
* Added temporary hack to not run test_attention_outputs for FlaxGPT2
* Returning attention weights from GPT2 and changed the tests accordingly
* Last fixes
* Bump flax dependency

Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
af1a10bf
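Below is a minimal sketch of how the attention output added by this commit could be exercised from user code. The checkpoint name bert-base-uncased, the example sentence, and the printed shapes are illustrative assumptions and are not part of the commit; the output_attentions flag and the attentions field on the model output are the behaviour the changelist describes for the Flax models.

```python
# Hedged sketch: requesting per-layer attention weights from a Flax BERT model.
# Checkpoint and input text are assumptions for illustration only.
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")

# "np" tensors from the tokenizer are accepted directly by Flax models.
inputs = tokenizer("Attention weights from a Flax BERT model.", return_tensors="np")

# Ask the model to also return the attention probabilities of every layer.
outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one array per layer, each of shape
# (batch_size, num_heads, seq_len, seq_len).
print(len(outputs.attentions), outputs.attentions[0].shape)
```

Per the GPT2 items in the list above, an analogous call against FlaxGPT2Model with output_attentions=True should likewise populate the attentions field.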