"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "4dc65591b5c61d75c3ef3a2a883bf1433e08fc45"
Flax Remat for LongT5 (#17994)
* [Flax] Add remat (gradient checkpointing)
* fix variable naming in test
* flip: checkpoint using a method
* fix naming
* fix class naming
* apply PVP's suggestions from code review
* add gradient_checkpointing to examples
* Add gradient_checkpointing to run_mlm_flax
* Add remat to longt5
* Add gradient checkpointing test longt5
* Fix args errors
* Fix remaining tests
* Make fixup & quality fixes
* replace kwargs
* remove unecessary kwargs
* Make fixup changes
* revert long_t5_flax changes
* Remove return_dict and copy to LongT5
* Remove test_gradient_checkpointing
Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co>
Showing
Please register or sign in to comment