"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "e8da77d181d316dd5c890e56e92cf153c7d68ee7"
[Flax] Adapt Flax models to new structure (#9484)
* Create modeling_flax_eletra with code copied from modeling_flax_bert
* Add ElectraForMaskedLM and ElectraForPretraining
* Add modeling test for Flax electra and fix naming and arg in Flax Electra model
* Add documentation
* Fix code style
* Create modeling_flax_eletra with code copied from modeling_flax_bert
* Add ElectraForMaskedLM and ElectraForPretraining
* Add modeling test for Flax electra and fix naming and arg in Flax Electra model
* Add documentation
* Fix code style
* Fix code quality
* Adjust tol in assert_almost_equal due to very small difference between model output, ranging 0.0010 - 0.0016
* Remove redundant ElectraPooler
* save intermediate
* adapt
* correct bert flax design
* adapt roberta as well
* finish roberta flax
* finish
* apply suggestions
* apply suggestions
Co-authored-by:
Chris Nguyen <anhtu2687@gmail.com>
Showing
This diff is collapsed.
This diff is collapsed.
Please register or sign in to comment