1. 13 May, 2022 1 commit
    • Update self-push workflow (#17177) · 38043d84
      Yih-Dar authored
      
      
      * update push ci
      
      * install git-python
      
      * update comment
      
      * update deepspeed jobs
      
      * fix report
      
      * skip 2 more tests that require fairscale
      
      * Fix test_fetcher.py to handle the case where `setup.py` is changed
      
      * set RUN_PT_TF_CROSS_TESTS=1 and final clean-up
      
      * remove SIGOPT_API_TOKEN
      
      * remove echo "$matrix_folders"
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
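
The `RUN_PT_TF_CROSS_TESTS=1` switch set in this workflow is an opt-in gate: PT/TF cross tests only run when a CI job exports it. A minimal sketch of how such an environment-variable gate typically works (the helper names here are illustrative, not necessarily the exact ones in the repo):

```python
# Sketch of an opt-in test gate driven by RUN_PT_TF_CROSS_TESTS.
# Helper names are illustrative; only the env-var name comes from the commit.
import os
import unittest


def parse_flag_from_env(key: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    return os.environ.get(key, str(default)).lower() in ("1", "true", "yes", "on")


_run_pt_tf_cross_tests = parse_flag_from_env("RUN_PT_TF_CROSS_TESTS")


def is_pt_tf_cross_test(test_case):
    """Skip the decorated test unless RUN_PT_TF_CROSS_TESTS=1 is exported."""
    return unittest.skipUnless(
        _run_pt_tf_cross_tests, "test requires RUN_PT_TF_CROSS_TESTS=1"
    )(test_case)
```

The `install git-python` step fits the same picture: test_fetcher.py presumably uses GitPython to diff the push against the previous commit and select only the impacted tests.
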
  2. 12 May, 2022 1 commit
  3. 25 Apr, 2022 1 commit
  4. 19 Apr, 2022 1 commit
  5. 23 Mar, 2022 1 commit
    • Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion from code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
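
A split like this typically keeps `file_utils` as a thin re-export shim, so existing `from transformers.file_utils import ...` statements keep working while the definitions move into submodules. A hedged sketch of that pattern (the submodule layout and symbol names below are illustrative assumptions):

```python
# file_utils.py - backward-compatibility shim after splitting the module.
# Nothing is defined here anymore; old import paths are simply re-exported.
# Submodule and symbol names are illustrative assumptions.
from .utils.generic import ModelOutput  # noqa: F401
from .utils.hub import cached_path  # noqa: F401
from .utils.import_utils import is_torch_available  # noqa: F401
```

The shim is what lets "Adapt all imports everywhere" land incrementally: callers can migrate module by module without a breaking release.
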
  6. 23 Dec, 2021 1 commit
  7. 20 Jul, 2021 1 commit
  8. 14 Jul, 2021 1 commit
  9. 12 Jul, 2021 1 commit
  10. 22 Jun, 2021 1 commit
  11. 15 Jun, 2021 1 commit
  12. 06 May, 2021 1 commit
  13. 26 Apr, 2021 1 commit
  14. 21 Apr, 2021 1 commit
  15. 08 Apr, 2021 1 commit
  16. 15 Mar, 2021 1 commit
  17. 09 Mar, 2021 1 commit
  18. 08 Mar, 2021 1 commit
  19. 25 Feb, 2021 1 commit
  20. 15 Feb, 2021 1 commit
  21. 08 Feb, 2021 1 commit
  22. 15 Jan, 2021 1 commit
  23. 14 Jan, 2021 1 commit
  24. 13 Jan, 2021 1 commit
    • [trainer] deepspeed integration (#9211) · 2df34f4a
      Stas Bekman authored
      
      
      * deepspeed integration
      
      * style
      
      * add test
      
      * ds wants to do its own backward
      
      * fp16 assert
      
      * Update src/transformers/training_args.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * style
      
      * for clarity extract what args are being passed to deepspeed
      
      * introduce the concept of self.wrapped_model
      
      * s/self.wrapped_model/self.model_wrapped/
      
      * complete transition to self.wrapped_model / self.model
      
      * fix
      
      * doc
      
      * give ds its own init
      
      * add custom overrides, handle bs correctly
      
      * fix test
      
      * clean up model_init logic, fix small bug
      
      * complete fix
      
      * collapse --deepspeed_config into --deepspeed
      
      * style
      
      * start adding doc notes
      
      * style
      
      * implement hf2ds optimizer and scheduler configuration remapping
      
      * oops
      
      * call get_num_training_steps absolutely when needed
      
      * workaround broken auto-formatter
      
      * deepspeed_config arg is no longer needed - fixed in deepspeed master
      
      * use hf's fp16 args in config
      
      * clean
      
      * start on the docs
      
      * rebase cleanup
      
      * finish up --fp16
      
      * clarify the supported stages
      
      * big refactor thanks to discovering deepspeed.init_distributed
      
      * cleanup
      
      * revert fp16 part
      
      * add checkpoint-support
      
      * move ds init into integrations
      
      * extend docs
      
      * cleanup
      
      * unfix docs
      
      * clean up old code
      
      * imports
      
      * move docs
      
      * fix logic
      
      * make it clear which file it's referring to
      
      * document nodes/gpus
      
      * style
      
      * wrong format
      
      * style
      
      * deepspeed handles gradient clipping
      
      * easier to read
      
      * major doc rewrite
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * docs
      
      * switch to AdamW optimizer
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * clarify doc
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
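
Two threads of this PR are easy to demonstrate together: `--deepspeed_config` was collapsed into `--deepspeed`, which now takes the path to a DeepSpeed JSON config, and the Trainer's own fp16 arguments are remapped into that config (the hf2ds remapping above). A hedged sketch of a minimal config reflecting those commits (values are illustrative, not recommendations):

```python
# Sketch: write a minimal DeepSpeed config matching what the commits describe:
# HF fp16 args mirrored into the config, an AdamW optimizer, and a scheduler
# that the hf2ds remapping can translate. Values are illustrative.
import json

ds_config = {
    "fp16": {"enabled": True},  # mirrors the Trainer's --fp16 flag
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-5}},
    "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 500}},
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# After the flag collapse, a single argument wires it in, e.g.:
#   python your_trainer_script.py --deepspeed ds_config.json --fp16 ...
```
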
  25. 23 Dec, 2020 1 commit
  26. 22 Dec, 2020 2 commits
  27. 18 Dec, 2020 1 commit
  28. 17 Dec, 2020 1 commit
  29. 11 Dec, 2020 1 commit
  30. 30 Nov, 2020 1 commit
    • [s2s trainer] fix DP mode (#8823) · 7f34d757
      Stas Bekman authored
      * fix DP case on multi-gpu
      
      * make executable
      
      * test all 3 modes
      
      * use the correct check for distributed
      
      * dp doesn't need a special case
      
      * restore original name
      
      * cleanup
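
The fix hinges on "use the correct check for distributed": DDP is signaled by a launcher-provided local rank (one process per GPU), whereas plain DP is just several visible GPUs in a single process, so "dp doesn't need a special case" beyond the usual `nn.DataParallel` wrap. A hedged sketch of that dispatch (an illustrative helper, not the trainer's actual code):

```python
# Sketch of dispatching between single-GPU, DP, and DDP modes.
# Illustrative helper; not the trainer's actual code.
import torch
import torch.nn as nn


def wrap_model(model: nn.Module, local_rank: int = -1) -> nn.Module:
    if local_rank != -1:
        # Distributed (DDP): the launcher sets local_rank, one process per GPU.
        torch.cuda.set_device(local_rank)
        if not torch.distributed.is_initialized():
            torch.distributed.init_process_group(backend="nccl")
        return nn.parallel.DistributedDataParallel(
            model.cuda(local_rank), device_ids=[local_rank]
        )
    if torch.cuda.device_count() > 1:
        # DP: a single process replicates the model over all visible GPUs.
        return nn.DataParallel(model.cuda())
    return model.cuda() if torch.cuda.is_available() else model
```
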
  31. 23 Nov, 2020 1 commit
  32. 18 Nov, 2020 1 commit
  33. 17 Nov, 2020 1 commit
  34. 08 Nov, 2020 1 commit
  35. 05 Nov, 2020 1 commit
  36. 28 Oct, 2020 1 commit
  37. 26 Oct, 2020 1 commit
  38. 23 Oct, 2020 1 commit
  39. 22 Oct, 2020 1 commit