Add AdaFactor optimizer from fairseq (#6722)
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.
* update PR fixes, add basic test
* bug -- incorrect params in test
* bugfix -- import Adafactor into test
* bugfix -- removed accidental T5 include
* resetting T5 to master
* bugfix -- include Adafactor in __init__
* longer loop for adafactor test
* remove double error class declare
* lint
* black
* isort
* Update src/transformers/optimization.py

  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* single docstring
* Cleanup docstring

Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
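
The memory claim above can be illustrated with a rough count of optimizer state for a single 2D weight matrix. This is a minimal sketch, not the ported code: the function names `adam_state_size` and `adafactor_state_size` are hypothetical, the numbers count stored elements rather than bytes, and the sketch assumes Adafactor's factored second-moment estimate (one accumulator per row and one per column) with the first moment disabled unless requested.

```python
def adam_state_size(rows: int, cols: int) -> int:
    """Elements of optimizer state Adam keeps for a rows x cols matrix.

    Adam stores two full-size buffers: the first and second moment estimates.
    """
    return 2 * rows * cols


def adafactor_state_size(rows: int, cols: int, use_first_moment: bool = False) -> int:
    """Elements of state for a factored second-moment estimate (assumed setup).

    The second moment is approximated from one accumulator per row and one
    per column instead of a full rows x cols buffer.
    """
    size = rows + cols
    if use_first_moment:
        # An optional first moment adds back one full-size buffer.
        size += rows * cols
    return size


if __name__ == "__main__":
    rows, cols = 1024, 4096  # illustrative shape, not taken from the commit
    adam = adam_state_size(rows, cols)
    adafactor = adafactor_state_size(rows, cols)
    print(f"Adam state elements:      {adam}")
    print(f"Adafactor state elements: {adafactor}")
```

The saving applies to matrix-shaped parameters; for 1D parameters such as biases there is nothing to factor, so the gap shown here is an upper-end illustration rather than a measurement of the T5 or MLM runs mentioned in the commit message.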