Commits · 2df34f4aba7ffbf47974f121767be052bebb23ca · chenpangpang / transformers

13 Jan, 2021 1 commit

[trainer] deepspeed integration (#9211) · 2df34f4a

Stas Bekman authored Jan 12, 2021



* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


23 Dec, 2020 1 commit
- Adapt to new name of `label_smoothing_factor` training arg (#9282) · a1cb6e98
  Sylvain Gugger authored Dec 23, 2020
  
22 Dec, 2020 2 commits

Revert renaming in finetune_trainer (#9262) · e6c1f1ca
Sylvain Gugger authored Dec 22, 2020


Seq2seq trainer (#9241) · 490b39e6

Sylvain Gugger authored Dec 22, 2020



* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


18 Dec, 2020 1 commit
- [trainer] apex fixes and tests (#9180) · f06d0fad
  Stas Bekman authored Dec 17, 2020
  
17 Dec, 2020 1 commit
- add tests for the new sharded ddp fairscale integration (#9177) · 63841c55
  Stas Bekman authored Dec 17, 2020
  
11 Dec, 2020 1 commit

Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>


30 Nov, 2020 1 commit

[s2s trainer] fix DP mode (#8823) · 7f34d757

Stas Bekman authored Nov 30, 2020

* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup


23 Nov, 2020 1 commit
- [trainer] make generate work with multigpu (#8716) · 1e45bef0
  Stas Bekman authored Nov 23, 2020
```
* make generate work with multigpu

* better fix - thanks @sgugger
```
18 Nov, 2020 1 commit
- [s2s] broken test (#8613) · 2819da02
  Stas Bekman authored Nov 18, 2020
  
17 Nov, 2020 1 commit

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>


08 Nov, 2020 1 commit
- [s2s test_finetune_trainer] failing multigpu test (#8400) · 66582492
  Stas Bekman authored Nov 08, 2020
  
05 Nov, 2020 1 commit
- [s2s] test_distributed_eval (#8315) · d787935a
  Stas Bekman authored Nov 05, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
28 Oct, 2020 1 commit

[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107) · 5423f2a9

Stas Bekman authored Oct 28, 2020



* move the helper code into testing_utils

* port test_trainer_distributed to work with pytest

* improve docs

* simplify notes

* doc

* doc

* style

* doc

* further improvements

* torch might not be available

* real fix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


26 Oct, 2020 1 commit

[Seq2Seq Trainer] Make sure padding is implemented for models without pad_token (#8043) · 664c7ec4

Patrick von Platen authored Oct 26, 2020

* make sure padding is implemented for non-padding tokens models as well

* add better error message

* add better warning

* remove results files

* Update examples/seq2seq/seq2seq_trainer.py

* remove unnecessary copy line

* correct usage of labels

* delete test files


23 Oct, 2020 1 commit

[Examples] Allow EncoderDecoderModels to be trained with Seq2Seq (#7809) · 3c682ea1

Patrick von Platen authored Oct 23, 2020

* Make Seq2Seq Trainer more similar to Trainer

* fix typo

* fix seq2seq trainer

* remove from tests

* remove lock

* remove train files

* delete test files

* correct typo

* check at init

* make sure trainer is not slowed down on TPU

* correct isort

* remove use cache

* fix use cache

* add last use cache = false


22 Oct, 2020 1 commit
- [s2s trainer] tests to use distributed on multi-gpu machine (#7965) · 023f0f37
  Stas Bekman authored Oct 22, 2020
  
17 Oct, 2020 1 commit
- [s2s testing] turn all to unittests, use auto-delete temp dirs (#7859) · 9f7b2b24
  Stas Bekman authored Oct 17, 2020
  
16 Oct, 2020 2 commits
- [seq2seq testing] improve readability (#7845) · 1652ddad
  Stas Bekman authored Oct 16, 2020
  
- [cleanup] assign todos, faster bart-cnn test (#7835) · 96e47d92
  Sam Shleifer authored Oct 16, 2020
```
* 2 beam output

* unassign/remove TODOs

* remove one more
```
07 Oct, 2020 1 commit

Trainer callbacks (#7596) · 08ba4b49

Sylvain Gugger authored Oct 07, 2020



* Initial callback proposal

* Finish various callbacks

* Post-rebase conflicts

* Fix tests

* Don't use something that's not set

* Documentation

* Remove unwanted print.

* Document all models can work

* Add tests + small fixes

* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Fix TF tests

* Real fix this time

* This one should work

* Fix typo

* Really fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


04 Oct, 2020 1 commit
- [s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532) · 99cb924b
  Suraj Patil authored Oct 04, 2020
  
01 Oct, 2020 3 commits
- [s2s] Adafactor support for builtin trainer (#7522) · de4d7b00
  Sam Shleifer authored Oct 01, 2020
  
- Fix seq2seq example test (#7518) · bdcc4b78
  Sylvain Gugger authored Oct 01, 2020
```
* Fix seq2seq example test

* Fix bad copy-paste

* Also save the state
```
- [s2sTrainer] test + code cleanup (#7467) · 48f23f92
  Sam Shleifer authored Oct 01, 2020
  
24 Sep, 2020 1 commit
- Seq2SeqTrainer (#6769) · 9e68d075
  Suraj Patil authored Sep 25, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```