Commits · f991daed185261085301d72c2cd634836df1044a · chenpangpang / transformers

22 Feb, 2021 1 commit
- defensive programming + expand/correct README (#10295) · f991daed
  Stas Bekman authored Feb 22, 2021
  
18 Feb, 2021 1 commit

[Trainer] memory tracker metrics (#10225) · 97e688bc

Stas Bekman authored Feb 18, 2021



* memory tracker metrics

* go back to eval for somewhat consistency

* handle no-gpu case

* deal with stackable eval calls

* restore callback order

* style

* simplify the API

* add test

* docs

* consistently use eval_ prefix

* improve docs

* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename method

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


16 Feb, 2021 1 commit
- set tgt_lang of MBart Tokenizer for summarization (#10205) · df1b0fb5
  Zhang Cheng authored Feb 16, 2021
  
15 Feb, 2021 2 commits

[WIP][examples/seq2seq] move old s2s scripts to legacy (#10136) · 1c8c2d9a

Suraj Patil authored Feb 16, 2021



* move old s2s scripts to legacy

* add the tests back

* proper rename

* restore

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


fix run_seq2seq.py; porting trainer tests to it (#10162) · 0b1f552a

Stas Bekman authored Feb 15, 2021

* fix run_seq2seq.py; porting DeepSpeed tests to it

* unrefactor

* defensive programming

* defensive programming 2

* port the rest of the trainer tests

* style

* a cleaner scripts dir finder

* cleanup


12 Feb, 2021 1 commit
- [examples/run_s2s] remove task_specific_params and update rouge computation (#10133) · f51188cb
  Suraj Patil authored Feb 12, 2021
```
* fix rouge metrics and task specific params

* fix typo

* round metrics

* typo

* remove task_specific_params
```
09 Feb, 2021 1 commit
- [examples/s2s] add test set predictions (#10085) · 63fddcf6
  Suraj Patil authored Feb 09, 2021
```
* add do_predict, pass eval_beams during eval

* update help

* apply suggestions from code review
```
08 Feb, 2021 5 commits
- transition to new tests dir (#10080) · 781220ac
  Stas Bekman authored Feb 08, 2021
  
- [trainer] deepspeed bug fixes and tests (#10039) · 322037e8
  Stas Bekman authored Feb 08, 2021
```
* deepspeed bug fixes and tests

* manual wrap?
```
- [s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics (#10046) · ece6c514
  Olivier authored Feb 08, 2021
```
* replace -100 token ids with the tokenizer pad_id for compute_metrics

* fixed typo for label_ids
```
- Can't mix --fp16 and --device cpu (#10041) · 24db8cc3
  Stas Bekman authored Feb 07, 2021
  
- json to jsonlines, and doc, and typo (#10043) · 769948fa
  Stas Bekman authored Feb 07, 2021
  
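
The change in #10046 (ece6c514) listed above replaces `-100` label ids with the tokenizer's pad id before metrics are computed. The value `-100` is conventionally used to mark label positions that the loss should ignore, but it is not a valid vocabulary id, so it has to be swapped out before labels can be decoded back to text. A minimal sketch of that idea in plain Python follows; the function name `replace_ignore_index` and the sample ids are hypothetical and are not taken from the script itself.

```python
# Illustrative sketch only; names and sample values are hypothetical.
IGNORE_INDEX = -100  # label value conventionally skipped by the loss


def replace_ignore_index(label_ids, pad_token_id):
    """Return a copy of label_ids with every -100 swapped for pad_token_id."""
    return [
        [pad_token_id if tok == IGNORE_INDEX else tok for tok in row]
        for row in label_ids
    ]


labels = [[42, 7, -100, -100], [13, -100, -100, -100]]
cleaned = replace_ignore_index(labels, pad_token_id=0)
print(cleaned)
```

The input list is left untouched, so the original labels remain usable for the loss while the cleaned copy is handed to the decoding step.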
05 Feb, 2021 2 commits

[examples] make run scripts executable (#10037) · 8ea412a8
Stas Bekman authored Feb 05, 2021
```
* make executable

* make executable

* same for the template

* cleanup
```

[examples/seq2seq] support label smoothing (#9844) · 1cd16512

Suraj Patil authored Feb 05, 2021

* add prepare_decoder_input_ids_from_labels in s2s models

* support lbl smoothing and enc/emb freezing

* fix freezing

* use pad_token_id from config

* remove embed freezing and add warning

* prepare decoder_input_ids inside DataCollatorForSeq2Seq


01 Feb, 2021 2 commits
- Remove subclass for sortish sampler (#9907) · 115d97dd
  Sylvain Gugger authored Feb 01, 2021
```
* Remove subclass for sortish sampler

* Use old Seq2SeqTrainer in script

* Styling
```
- fix logger format for non-main process (#9911) · 6bab8368
  Stas Bekman authored Feb 01, 2021
  
29 Jan, 2021 1 commit
- correctly handle mt5 (#9879) · 6bf94bc0
  Stas Bekman authored Jan 29, 2021
  
28 Jan, 2021 1 commit
- Deprecate model_path in Trainer.train (#9854) · b4e559cf
  Sylvain Gugger authored Jan 28, 2021
  
27 Jan, 2021 1 commit
- Setup logging with a stdout handler (#9816) · f2fabedb
  Sylvain Gugger authored Jan 27, 2021
  
26 Jan, 2021 2 commits

Fix fine-tuning translation scripts (#9809) · 8f6c12d3
Magdalena Biesialska authored Jan 26, 2021


Improve pytorch examples for fp16 (#9796) · 10e5f282

Andrea Cappelli authored Jan 26, 2021



* Pad to 8x for fp16 multiple choice example (#9752)

* Pad to 8x for fp16 squad trainer example (#9752)

* Pad to 8x for fp16 ner example (#9752)

* Pad to 8x for fp16 swag example (#9752)

* Pad to 8x for fp16 qa beam search example (#9752)

* Pad to 8x for fp16 qa example (#9752)

* Pad to 8x for fp16 seq2seq example (#9752)

* Pad to 8x for fp16 glue example (#9752)

* Pad to 8x for fp16 new ner example (#9752)

* update script template #9752

* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

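
The "Pad to 8x for fp16" items in #9796 (10e5f282) above pad sequences to a length that is a multiple of 8 when fp16 is enabled, a choice usually motivated by half-precision kernels running more efficiently on such shapes. In the library this is typically requested through a `pad_to_multiple_of` argument on the data collator or tokenizer; the standalone sketch below only illustrates the length arithmetic, and the helper name `pad_to_multiple` is hypothetical.

```python
# Illustrative sketch only; pad_to_multiple is a hypothetical helper.
def pad_to_multiple(token_ids, pad_token_id, multiple=8):
    """Right-pad token_ids so its length is the next multiple of `multiple`."""
    remainder = len(token_ids) % multiple
    if remainder == 0:
        # Already aligned: return an unpadded copy.
        return list(token_ids)
    return list(token_ids) + [pad_token_id] * (multiple - remainder)


padded = pad_to_multiple(list(range(1, 14)), pad_token_id=0)
print(len(padded), padded)
```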

25 Jan, 2021 1 commit

Auto-resume training from checkpoint (#9776) · caf4abf7

Sylvain Gugger authored Jan 25, 2021



* Auto-resume training from checkpoint

* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

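
The auto-resume change in #9776 (caf4abf7) above lets the example scripts pick up training from an existing checkpoint. The commit message does not spell out the mechanism, so the following is only a sketch of one way such a lookup could work, assuming checkpoints are saved as `checkpoint-<step>` subfolders of the output directory; the helper name `find_last_checkpoint` is hypothetical.

```python
# Illustrative sketch only; assumes a "checkpoint-<step>" folder layout.
import os
import re

_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")  # assumed naming scheme


def find_last_checkpoint(output_dir):
    """Return the path of the highest-step checkpoint folder, or None."""
    if not os.path.isdir(output_dir):
        return None
    candidates = []
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            # Compare steps numerically so 1000 ranks above 500.
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    return os.path.join(output_dir, max(candidates)[1])
```

A script could call this on its output directory at start-up and, if a path comes back, resume from it instead of starting from scratch.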

22 Jan, 2021 1 commit
- Fixes to run_seq2seq and instructions (#9734) · 411c5821
  Sylvain Gugger authored Jan 22, 2021
```
* Fixes to run_seq2seq and instructions

* Add more defaults for summarization
```
21 Jan, 2021 1 commit

Fix memory regression in Seq2Seq example (#9713) · 5f80c15e

Sylvain Gugger authored Jan 21, 2021

* Fix memory regression in Seq2Seq example

* Fix test and properly deal with -100

* Easier condition with device safety

* Patch for MBartTokenizerFast


19 Jan, 2021 2 commits

New run_seq2seq script (#9605) · e4c06ed6

Sylvain Gugger authored Jan 19, 2021



* New run_seq2seq script

* Add tests

* Mark as slow

* Update examples/seq2seq/run_seq2seq.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/data/data_collator.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/data/data_collator.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>


Fix old Seq2SeqTrainer (#9675) · 97b787fb
Sylvain Gugger authored Jan 19, 2021


15 Jan, 2021 1 commit
- deepspeed + grad accum (#9622) · c60e0e1e
  Stas Bekman authored Jan 15, 2021
  
14 Jan, 2021 1 commit

Upstream (and rename) sortish sampler (#9574) · 329fe274

Sylvain Gugger authored Jan 14, 2021



* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


13 Jan, 2021 1 commit

[trainer] deepspeed integration (#9211) · 2df34f4a

Stas Bekman authored Jan 12, 2021



* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


05 Jan, 2021 1 commit

[PyTorch Bart] Split Bart into different models (#9343) · eef66035

Patrick von Platen authored Jan 05, 2021

* first try

* remove old template

* finish bart

* finish mbart

* delete unnecessary line

* init pegasus

* save intermediate

* correct pegasus

* finish pegasus

* remove cookie cutter leftover

* add marian

* finish blenderbot

* replace in file

* correctly split blenderbot

* delete "old" folder

* correct "add statement"

* adapt config for tf comp

* correct configs for tf

* remove ipdb

* fix more stuff

* fix mbart

* push pegasus fix

* fix mbart

* more fixes

* fix research projects code

* finish docs for bart, mbart, and marian

* delete unnecessary file

* correct attn typo

* correct configs

* remove pegasus for seq class

* correct peg docs

* correct peg docs

* finish configs

* further improve docs

* add copied from statements to mbart

* fix copied from in mbart

* add copy statements to marian

* add copied from to marian

* add pegasus copied from

* finish pegasus

* finish copied from

* Apply suggestions from code review

* make style

* backward comp blenderbot

* apply lysandres and sylvains suggestions

* apply suggestions

* push last fixes

* fix docs

* fix tok tests

* fix imports code style

* fix doc


23 Dec, 2020 1 commit
- Adapt to new name of `label_smoothing_factor` training arg (#9282) · a1cb6e98
  Sylvain Gugger authored Dec 23, 2020
  
22 Dec, 2020 3 commits

Revert renaming in finetune_trainer (#9262) · e6c1f1ca
Sylvain Gugger authored Dec 22, 2020

Fix link to bertabs/README.md (#9255) · 37d6fb5d
Manuel Romero authored Dec 22, 2020


Seq2seq trainer (#9241) · 490b39e6

Sylvain Gugger authored Dec 22, 2020



* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


20 Dec, 2020 1 commit
- better logging and help (#9203) · f38c4ad3
  Stas Bekman authored Dec 20, 2020
  
18 Dec, 2020 2 commits
- Add timing inside Trainer (#9196) · 1198ba8f
  Sylvain Gugger authored Dec 18, 2020
```
* Add timing inside Trainer

* Fix tests

* Add n_objs for train

* Sort logs
```
- [trainer] apex fixes and tests (#9180) · f06d0fad
  Stas Bekman authored Dec 17, 2020
  
17 Dec, 2020 1 commit
- add tests for the new sharded ddp fairscale integration (#9177) · 63841c55
  Stas Bekman authored Dec 17, 2020
  
16 Dec, 2020 1 commit

Experimental support for fairscale ShardedDDP (#9139) · 9a671853

Sylvain Gugger authored Dec 16, 2020

* Experimental support for fairscale ShardedDDP

* Add import error if fairscale not available

* Address review comments

* Fix seq2seq trainer


15 Dec, 2020 1 commit
- native amp leak fix landed in 1.7.1 (#9115) · 14c79c3e
  Stas Bekman authored Dec 15, 2020
```
update README with good news that the leak fix has been applied to pytorch-1.7.1.
```