Commits · cabcc75171650f9131a4cf31c62e1f102589014e · chenpangpang / transformers

20 Jul, 2021 1 commit

[trainer] sanity checks for `save_steps=0|None` and `logging_steps=0` (#12796) · cabcc751

Stas Bekman authored Jul 20, 2021



* [trainer] fix % 0

* sanity checks

* fix logging_strategy

* correction

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

cabcc751
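
The `% 0` item above points at a modulo-by-zero hazard: a periodic check such as `step % save_steps == 0` raises `ZeroDivisionError` when the interval is 0, and `TypeError` when it is `None`. The following is a minimal sketch of that kind of sanity check. The helper names `validate_interval` and `should_trigger` are hypothetical and are not taken from `training_args.py` or the trainer.

```python
from typing import Optional


def validate_interval(name: str, value: Optional[int]) -> int:
    """Hypothetical helper: reject None or non-positive intervals up front."""
    if value is None or value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


def should_trigger(step: int, interval: Optional[int]) -> bool:
    """Hypothetical helper: True every `interval` steps.

    A falsy interval (0 or None) disables the action instead of
    reaching the modulo and raising an exception.
    """
    if not interval:
        return False
    return step % interval == 0
```

Validating once at argument-parsing time gives an early, readable error, while the guard in the periodic check keeps the training loop safe if a zero interval still gets through.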

14 Jul, 2021 1 commit
- [test] split test into 4 sub-tests to avoid timeout (#12710) · a18a17d2
  Stas Bekman authored Jul 14, 2021
```
* split the test into 4 sub-tests to avoid timeout

* fix decorator order
```
  a18a17d2
12 Jul, 2021 1 commit
- The extended trainer tests should require torch (#12650) · fb5665b5
  Lysandre Debut authored Jul 12, 2021
  
  fb5665b5
22 Jun, 2021 1 commit
- [trainer] 2 bug fixes and a rename (#12309) · ebe54135
  Stas Bekman authored Jun 22, 2021
```
* bug fixes and a rename

* add extended DDP test
```
  ebe54135
15 Jun, 2021 1 commit
- [testing] ensure concurrent pytest workers use a unique port for torch.dist (#12166) · 6e7cc5cc
  Stas Bekman authored Jun 15, 2021
```
* ensure concurrent pytest workers use a unique port for torch.distributed.launch

* reword
```
  6e7cc5cc
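
One way to give each concurrent pytest worker its own port is to offset a base port by the worker's index. The sketch below assumes the tests run under pytest-xdist, which sets the `PYTEST_XDIST_WORKER` environment variable to names such as `gw0` and `gw1`, and it uses 29500, the default master port of `torch.distributed.launch`, as the base. The function names are hypothetical and the actual change in this commit may differ.

```python
import os
import re

BASE_PORT = 29500  # default master port of torch.distributed.launch


def xdist_worker_id() -> int:
    """Hypothetical helper: parse "gw<N>" into N; 0 when not under pytest-xdist."""
    worker = os.environ.get("PYTEST_XDIST_WORKER", "gw0")
    match = re.fullmatch(r"gw(\d+)", worker)
    return int(match.group(1)) if match else 0


def unique_dist_port() -> int:
    """Hypothetical helper: a port that differs for each xdist worker."""
    return BASE_PORT + xdist_worker_id()
```

The resulting value could then be passed as `--master_port` when a test launches a distributed subprocess, so that two workers never try to bind the same port.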
06 May, 2021 1 commit
- [cuda ext tests] fixing tests (#11619) · 619200cc
  Stas Bekman authored May 06, 2021
```
* fixing tests

* cleanup
```
  619200cc
26 Apr, 2021 1 commit

[Examples] Fixes inconsistency around eval vs val and predict vs test (#11380) · 1d30ec95

Bhadresh Savani authored Apr 26, 2021

* added changes for uniformity

* modified files

* corrected typo

* fixed qa scripts

* fix typos

* fixed predict typo in qa no trainer

* fixed test file

* reverted trainer changes

* reverted trainer changes in custom examples

* updated readme

* added changes in deepspeed test

* added changes for predict and eval

1d30ec95

21 Apr, 2021 1 commit

Examples reorg (#11350) · dabeb152

Sylvain Gugger authored Apr 21, 2021



* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

dabeb152

08 Apr, 2021 1 commit

[tests] relocate core integration tests (#11146) · 66446909

Stas Bekman authored Apr 08, 2021

* relocate core integration tests

* add sys.path context manager

* cleanup

* try

* try2

* fix path

* doc

* style

* add dep

* add 2 more deps

66446909

15 Mar, 2021 1 commit

split seq2seq script into summarization & translation (#10611) · 6f840990

Théo Matussière authored Mar 15, 2021



* split seq2seq script, update docs

* needless diff

* fix readme

* remove test diff

* s/summarization/translation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* cr

* fix arguments & better mbart/t5 refs

* copyright
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* reword readme
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* s/summarization/translation

* short script names

* fix tests

* fix isort, include mbart doc

* delete old script, update tests

* automate source prefix

* automate source prefix for translation

* s/translation/trans
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* fix script name (short version)

* typos
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* exact parameter
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* remove superfluous source_prefix calls in docs

* rename scripts & warn for source prefix

* black

* flake8
Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

6f840990

09 Mar, 2021 1 commit
- Fairscale FSDP fix model save (#10596) · 0d909f6b
  Sylvain Gugger authored Mar 09, 2021
```
* Hotfix fairscale FSDP

* Evaluation works

* Save on process zero
```
  0d909f6b
08 Mar, 2021 1 commit
- [examples tests] various fixes (#10584) · 917f1045
  Stas Bekman authored Mar 08, 2021
```
* fix sharded ddp enum

* test fixes

* stronger validation + apex breaks other tests
```
  917f1045
25 Feb, 2021 1 commit

Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354) · 9d14be5c

Sylvain Gugger authored Feb 25, 2021



* Add support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

9d14be5c

15 Feb, 2021 1 commit

fix run_seq2seq.py; porting trainer tests to it (#10162) · 0b1f552a

Stas Bekman authored Feb 15, 2021

* fix run_seq2seq.py; porting DeepSpeed tests to it

* unrefactor

* defensive programming

* defensive programming 2

* port the rest of the trainer tests

* style

* a cleaner scripts dir finder

* cleanup

0b1f552a

08 Feb, 2021 1 commit
- [trainer] deepspeed bug fixes and tests (#10039) · 322037e8
  Stas Bekman authored Feb 08, 2021
```
* deepspeed bug fixes and tests

* manual wrap?
```
  322037e8
15 Jan, 2021 1 commit
- deepspeed + grad acumm (#9622) · c60e0e1e
  Stas Bekman authored Jan 15, 2021
  
  c60e0e1e
14 Jan, 2021 1 commit

Upstream (and rename) sortish sampler (#9574) · 329fe274

Sylvain Gugger authored Jan 14, 2021



* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

329fe274

13 Jan, 2021 1 commit

[trainer] deepspeed integration (#9211) · 2df34f4a

Stas Bekman authored Jan 12, 2021



* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

2df34f4a

23 Dec, 2020 1 commit
- Adapt to new name of `label_smoothing_factor` training arg (#9282) · a1cb6e98
  Sylvain Gugger authored Dec 23, 2020
  
  a1cb6e98
22 Dec, 2020 2 commits

Revert renaming in finetune_trainer (#9262) · e6c1f1ca
Sylvain Gugger authored Dec 22, 2020

e6c1f1ca

Seq2seq trainer (#9241) · 490b39e6

Sylvain Gugger authored Dec 22, 2020



* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

490b39e6

18 Dec, 2020 1 commit
- [trainer] apex fixes and tests (#9180) · f06d0fad
  Stas Bekman authored Dec 17, 2020
  
  f06d0fad
17 Dec, 2020 1 commit
- add tests for the new sharded ddp fairscale integration (#9177) · 63841c55
  Stas Bekman authored Dec 17, 2020
  
  63841c55
11 Dec, 2020 1 commit

Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

783d7d26

30 Nov, 2020 1 commit

[s2s trainer] fix DP mode (#8823) · 7f34d757

Stas Bekman authored Nov 30, 2020

* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup

7f34d757

23 Nov, 2020 1 commit
- [trainer] make generate work with multigpu (#8716) · 1e45bef0
  Stas Bekman authored Nov 23, 2020
```
* make generate work with multigpu

* better fix - thanks @sgugger
```
  1e45bef0
18 Nov, 2020 1 commit
- [s2s] broken test (#8613) · 2819da02
  Stas Bekman authored Nov 18, 2020
  
  2819da02
17 Nov, 2020 1 commit

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

dd52804f

08 Nov, 2020 1 commit
- [s2s test_finetune_trainer] failing multigpu test (#8400) · 66582492
  Stas Bekman authored Nov 08, 2020
  
  66582492
05 Nov, 2020 1 commit
- [s2s] test_distributed_eval (#8315) · d787935a
  Stas Bekman authored Nov 05, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  d787935a
28 Oct, 2020 1 commit

[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107) · 5423f2a9

Stas Bekman authored Oct 28, 2020



* move the helper code into testing_utils

* port test_trainer_distributed to work with pytest

* improve docs

* simplify notes

* doc

* doc

* style

* doc

* further improvements

* torch might not be available

* real fix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

5423f2a9

26 Oct, 2020 1 commit

[Seq2Seq Trainer] Make sure padding is implemented for models without pad_token (#8043) · 664c7ec4

Patrick von Platen authored Oct 26, 2020

* make sure padding is implemented for non-padding tokens models as well

* add better error message

* add better warning

* remove results files

* Update examples/seq2seq/seq2seq_trainer.py

* remove unnecessary copy line

* correct usage of labels

* delete test files

664c7ec4

23 Oct, 2020 1 commit

[Examples] Allow EncoderDecoderModels to be trained with Seq2Seq (#7809) · 3c682ea1

Patrick von Platen authored Oct 23, 2020

* Make Seq2Seq Trainer more similar to Trainer

* fix typo

* fix seq2seq trainer

* remove from tests

* remove lock

* remove train files

* delete test files

* correct typo

* check at init

* make sure trainer is not slowed down on TPU

* correct isort

* remove use cache

* fix use cache

* add last use cache = false

3c682ea1

22 Oct, 2020 1 commit
- [s2s trainer] tests to use distributed on multi-gpu machine (#7965) · 023f0f37
  Stas Bekman authored Oct 22, 2020
  
  023f0f37
17 Oct, 2020 1 commit
- [s2s testing] turn all to unittests, use auto-delete temp dirs (#7859) · 9f7b2b24
  Stas Bekman authored Oct 17, 2020
  
  9f7b2b24
16 Oct, 2020 2 commits
- [seq2seq testing] improve readability (#7845) · 1652ddad
  Stas Bekman authored Oct 16, 2020
  
  1652ddad
- [cleanup] assign todos, faster bart-cnn test (#7835) · 96e47d92
  Sam Shleifer authored Oct 16, 2020
```
* 2 beam output

* unassign/remove TODOs

* remove one more
```
  96e47d92
07 Oct, 2020 1 commit

Trainer callbacks (#7596) · 08ba4b49

Sylvain Gugger authored Oct 07, 2020



* Initial callback proposal

* Finish various callbacks

* Post-rebase conflicts

* Fix tests

* Don't use something that's not set

* Documentation

* Remove unwanted print.

* Document all models can work

* Add tests + small fixes

* Update docs/source/internal/trainer_utils.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Fix TF tests

* Real fix this time

* This one should work

* Fix typo

* Really fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

08ba4b49

04 Oct, 2020 1 commit
- [s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532) · 99cb924b
  Suraj Patil authored Oct 04, 2020
  
  99cb924b
01 Oct, 2020 1 commit
- [s2s] Adafactor support for builtin trainer (#7522) · de4d7b00
  Sam Shleifer authored Oct 01, 2020
  
  de4d7b00