Commits · d9a81fc0c5d8339357a42435009a5be3a190b305 · chenpangpang / transformers

"...research_projects/seq2seq-distillation/utils copy.py" did not exist on "de9e2979647338bc9617dae68c5e9dccc413fb9f"

18 Feb, 2021 2 commits

[Trainer] memory tracker metrics (#10225) · 97e688bc

Stas Bekman authored Feb 18, 2021



* memory tracker metrics

* go back to eval for somewhat consistency

* handle no-gpu case

* deal with stackable eval calls

* restore callback order

* style

* simplify the API

* add test

* docs

* consistently use eval_ prefix

* improve docs

* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename method

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

97e688bc

Introduce warmup_ratio training argument (#10229) · d7f38c5d

Tanmay Garg authored Feb 18, 2021

Introduce warmup_ratio training argument in both
TrainingArguments and TFTrainingArguments classes (#6673)

d7f38c5d

11 Feb, 2021 2 commits

Add SageMakerTrainer for model paralellism (#10122) · 31245775

Sylvain Gugger authored Feb 11, 2021

* Refactor things out of main train

* Store signature

* Add SageMakerTrainer

* Init + Copyright

* Address review comments

31245775

[DeepSpeed in notebooks] Jupyter + Colab (#10130) · b54cb0bd

Stas Bekman authored Feb 11, 2021

* init devices/setup explicitly

* docs + test

* simplify

* cleanup

* cleanup

* cleanup

* correct the required dist setup

* derive local_rank from env LOCAL_RANK

b54cb0bd

09 Feb, 2021 1 commit
- Fix some edge cases in report_to and add deprecation warnings (#10100) · 77c0ce8c
  Sylvain Gugger authored Feb 09, 2021
  
  77c0ce8c
31 Jan, 2021 1 commit

Clarify definition of seed argument in TrainingArguments (#9903) · 22121e81

lewtun authored Jan 31, 2021



* Clarify definition of seed argument in Trainer

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

22121e81

29 Jan, 2021 2 commits
- refactor deepspeed setup devices (#9880) · 1420b5ff
  Stas Bekman authored Jan 29, 2021
  
  1420b5ff
- When on sagemaker use their env variables for saves (#9876) · 7eadfe16
  Sylvain Gugger authored Jan 29, 2021
```
* When on sagemaker use their env variables for saves

* Address review comments

* Quality
```
  7eadfe16
28 Jan, 2021 2 commits
- pin_memory -> dataloader_pin_memory (#9874) · bc109ae5
  abhishek thakur authored Jan 28, 2021
  
  bc109ae5
- Pin memory in Trainer by default (#9857) · 25fcb5c1
  abhishek thakur authored Jan 28, 2021
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
```
  25fcb5c1
27 Jan, 2021 1 commit

Add a flag for find_unused_parameters (#9820) · c7b7bd99

Sylvain Gugger authored Jan 27, 2021



* Add a flag for find_unused_parameters

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Remove negation
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

c7b7bd99

26 Jan, 2021 1 commit

Smdistributed trainer (#9798) · 0d0efd3a

Sylvain Gugger authored Jan 26, 2021

* Add a debug print

* Adapt Trainer to use smdistributed if available

* Forgotten parenthesis

* Real check for sagemaker

* Donforget to define device...

* Woopsie, local)rank is defined differently

* Update since local_rank has the proper value

* Remove debug statement

* More robust check for smdistributed

* Quality

* Deal with key not present error

0d0efd3a

22 Jan, 2021 1 commit
- Add `report_to` training arguments to control the reporting integrations used (#9735) · 82d46feb
  Sylvain Gugger authored Jan 22, 2021
  
  82d46feb
20 Jan, 2021 2 commits

Fix style · 2a703773
Sylvain Gugger authored Jan 20, 2021

2a703773

Fix Trainer and Args to mention AdamW, not Adam. (#9685) · 538245b0

Gunjan Chhablani authored Jan 20, 2021

* Fix Trainer and Args to mention AdamW, not Adam.

* Update the docs for Training Arguments.

* Change arguments adamw_* to adam_*

* Fixed links to AdamW in TrainerArguments docs

* Fix line length in Training Args docs.

538245b0

14 Jan, 2021 2 commits

Upstream (and rename) sortish sampler (#9574) · 329fe274

Sylvain Gugger authored Jan 14, 2021



* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

329fe274

Fix Trainer with a parallel model (#9578) · 5e1bea4f
Sylvain Gugger authored Jan 14, 2021
```
* Fix Trainer with a parallel model

* More clean up
```
5e1bea4f

13 Jan, 2021 2 commits

Fix data parallelism in Trainer (#9566) · 04dc65e5

Sylvain Gugger authored Jan 13, 2021



* Fix data parallelism in Trainer

* Update src/transformers/training_args.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

04dc65e5

[trainer] deepspeed integration (#9211) · 2df34f4a

Stas Bekman authored Jan 12, 2021



* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

2df34f4a

11 Jan, 2021 1 commit

[trainer] remove `--model_parallel` (#9451) · 33b74228

Stas Bekman authored Jan 11, 2021



* fix bad merge - dropped code

* remove --model_parallel

* Deal with TrainingArguments

* Use a private attr and fix batch sizes

* fix _n_gpu

* add is_parallel helper wrapper

* fix attribute

* introduce a new attribute is_model_parallel

* docs

* docs

* Put back init False and rearrange doc

* Ignore non-init args in HFArgumentParser
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

33b74228

05 Jan, 2021 2 commits

[trainer] group fp16 args together (#9409) · 29acabd8

Stas Bekman authored Jan 05, 2021

* [t5 doc] typos

a few run away backticks

@sgugger

* style

* [trainer] put fp16 args together

this PR proposes a purely cosmetic change that puts all the fp16 args together - so they are easier to manager/read

@sgugger

* style

29acabd8

[trainer] --model_parallel hasn't been implemented for most models (#9347) · 748006c0

Stas Bekman authored Jan 05, 2021

* --model_parallel hasn't been implemented for most models

* make the help clear as well

* implement is_parallelizable; use it

* oops

* remove property

748006c0

22 Dec, 2020 1 commit

Seq2seq trainer (#9241) · 490b39e6

Sylvain Gugger authored Dec 22, 2020



* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

490b39e6

18 Dec, 2020 1 commit
- Add timing inside Trainer (#9196) · 1198ba8f
  Sylvain Gugger authored Dec 18, 2020
```
* Add timing inside Trainer

* Fix tests

* Add n_objs for train

* Sort logs
```
  1198ba8f
16 Dec, 2020 1 commit

Experimental support for fairscale ShardedDDP (#9139) · 9a671853

Sylvain Gugger authored Dec 16, 2020

* Experimental stupport for fairscale ShardedDDP

* Add import error if fairscale not available

* Address review comments

* Fix seq2seq trainer

9a671853

15 Dec, 2020 3 commits

Fix fp16_backend field · 51adb97c
Sylvain Gugger authored Dec 15, 2020

51adb97c

Add possibility to switch between APEX and AMP in Trainer (#9137) · ad895af9

Sylvain Gugger authored Dec 15, 2020



* Add possibility to switch between APEX and AMP in Trainer

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

ad895af9

Clarify use of TrainingArguments.disable_tqdm in Jupyter Notebooks (#9076) · ed1845ef

lewtun authored Dec 15, 2020



* Clarify impact of disable_tqdm on Jupyter Notebooks

* Add weblink to argparse

* Replace "dev set" with more common "validation set" in do_eval

* Tweak prediction_loss_only

* Tweak description of Adam hyperparameters

* Add weblink to TensorBoard

* Capitalise apex

* Tweak local_rank description

* Add weblink for wandb

* Replace nlp with datasets

* Tweak grammar in model_parallel

* Capitalise apex

* Update TensorFlow training args to match PyTorch ones

* Fix style

* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add obj to datasets.Dataset
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ed1845ef

14 Dec, 2020 1 commit
- correct var name in TrainingArguments docstring (#9096) · d6af344c
  Navjot authored Dec 14, 2020
  
  d6af344c
07 Dec, 2020 1 commit
- Copyright (#8970) · 00aa9dbc
  Sylvain Gugger authored Dec 07, 2020
```
* Add copyright everywhere missing

* Style
```
  00aa9dbc
01 Dec, 2020 2 commits
- Add a `parallel_mode` property to TrainingArguments (#8877) · b08843cf
  Sylvain Gugger authored Dec 01, 2020
```
* Add a `distributed_env` property to TrainingArguments

* Change name

* Address comment
```
  b08843cf
- Better support for resuming training (#8878) · 7c10dd22
  Sylvain Gugger authored Dec 01, 2020
  
  7c10dd22
23 Nov, 2020 2 commits

Document new training argument · 49759c0c
Sylvain Gugger authored Nov 23, 2020

49759c0c

gpt2 and t5 parallel modeling (#8696) · 1cd9be2a

alexorona authored Nov 23, 2020



* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

1cd9be2a

20 Nov, 2020 1 commit
- Document adam betas TrainingArguments (#8688) · 63e91f5f
  Sylvain Gugger authored Nov 20, 2020
  
  63e91f5f
17 Nov, 2020 1 commit

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

dd52804f

03 Nov, 2020 1 commit

improve documentation of training_args.py (#8270) · 6a064447

Philip May authored Nov 03, 2020



* improve documentation of training_args.py

- do_train
- do_eval
- do_predict

* fix line too long

* fix style with black on training_args.py

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix line length with utils/style_doc

* black reformatting
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

6a064447

30 Oct, 2020 1 commit

Fix two bugs with --logging_first_step (#8193) · 8f1c960e

Abi See authored Oct 30, 2020

* make sure that logging_first_step evaluates

* fix bug with incorrect loss on logging_first_step

* fix style

* logging_first_step only logs, not evals

8f1c960e

29 Oct, 2020 1 commit

Fix doc errors and typos across the board (#8139) · 969859d5

Santiago Castro authored Oct 29, 2020

* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes

969859d5

27 Oct, 2020 1 commit
- Doc styling fixes (#8074) · c42596bc
  Sylvain Gugger authored Oct 27, 2020
```
* Fix a few docstrings

* More fixes

* Styling
```
  c42596bc