- 30 Jul, 2020 1 commit
-
-
Stas Bekman authored
-
- 29 Jul, 2020 2 commits
-
-
Julien Plu authored
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre comments
* Trigger CI
* Remove unused import
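One of the bullets above mentions `assert_cardinality`, which lets the trainer compute the dataset size without a full pass. A minimal sketch of that tf.data pattern; the dataset and batch size here are illustrative, not taken from the trainer code:

```python
import tensorflow as tf  # the commit requires TF >= 2.2 for the trainer

# Illustrative dataset; in the trainer this would come from tfds / TFRecords,
# where the cardinality is often unknown.
num_examples = 1000
dataset = tf.data.Dataset.from_tensor_slices(tf.range(num_examples))

# Declare the number of elements so tf.data reports a known cardinality
# instead of UNKNOWN, letting the trainer derive steps per epoch cheaply.
dataset = dataset.apply(tf.data.experimental.assert_cardinality(num_examples))

batch_size = 32  # assumed value for illustration
steps_per_epoch = int(tf.data.experimental.cardinality(dataset).numpy()) // batch_size

# The commit also switches to the dataset's own enumerate() rather than Python's.
for step, batch in dataset.batch(batch_size).enumerate():
    if step == 0:
        print(steps_per_epoch, batch.shape)
```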
-
Lysandre Debut authored
-
- 28 Jul, 2020 5 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Stas Bekman authored
* add a link to README.md
* Update README.md
-
Sam Shleifer authored
* MBART: support summarization tasks
* fix test
* Style
* add tokenizer test
-
Sam Shleifer authored
-
- 27 Jul, 2020 4 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Suraj Patil authored
-
- 24 Jul, 2020 1 commit
-
-
Sam Shleifer authored
-
- 22 Jul, 2020 2 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
- 21 Jul, 2020 4 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Aditya Soni authored
-
- 20 Jul, 2020 3 commits
-
-
Qingqing Cao authored
DataParallel training was fixed in https://github.com/huggingface/transformers/pull/5733; this commit also fixes evaluation, which is more convenient when the user enables both `do_train` and `do_eval`.
-
Sam Shleifer authored
Huge MT speedup!
-
Stas Bekman authored
* DataParallel fixes:
  1. switched to a more precise check:
     - if self.args.n_gpu > 1:
     + if isinstance(model, nn.DataParallel):
  2. fix tests - require the same fixup under DataParallel as the training module
* another fix
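A minimal sketch of the pattern this change describes: deciding whether to average per-replica losses by checking the model's actual type rather than a GPU-count argument. The function and variable names are illustrative, not the Trainer's exact code:

```python
import torch.nn as nn

def training_step(model, batch):
    # Sketch only: run a forward/backward pass and reduce the loss correctly
    # when the model is wrapped in nn.DataParallel.
    outputs = model(**batch)
    loss = outputs[0]
    if isinstance(model, nn.DataParallel):
        # DataParallel returns one loss per replica; average them to a scalar.
        # This check is more precise than `self.args.n_gpu > 1`, since a model
        # can sit on a multi-GPU machine without being wrapped.
        loss = loss.mean()
    loss.backward()
    return loss.item()
```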
-
- 18 Jul, 2020 4 commits
-
-
Sam Shleifer authored
Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Nathan Raw authored
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
-
- 17 Jul, 2020 1 commit
-
-
Sam Shleifer authored
-
- 16 Jul, 2020 1 commit
-
-
Sam Shleifer authored
-
- 15 Jul, 2020 2 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
- 14 Jul, 2020 1 commit
-
-
Boris Dayma authored
* docs(wandb): explain how to use W&B integration; fix #5262
* Also mention TensorBoard
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
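For readers unfamiliar with the integration being documented, here is a minimal standalone Weights & Biases logging sketch; the project and metric names are made up for illustration and are not taken from the docs this commit adds:

```python
import wandb

wandb.init(project="my-transformers-experiments")  # hypothetical project name
for step in range(3):
    wandb.log({"train/loss": 1.0 / (step + 1)}, step=step)
wandb.finish()
```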
-
- 10 Jul, 2020 1 commit
-
-
Julien Chaumond authored
Co-Authored-By: Suraj Patil <surajp815@gmail.com>
Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 09 Jul, 2020 1 commit
-
-
Lysandre Debut authored
* Test XLA examples
* Style
* Using `require_torch_tpu`
* Style
* No need for pytest
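The `require_torch_tpu` marker mentioned above skips tests when no TPU backend is available. A sketch of how such a decorator could look; this is an illustration, not the library's implementation:

```python
import unittest

def _tpu_available():
    # Probe for torch_xla, the package that backs PyTorch/XLA on TPU.
    try:
        import torch_xla.core.xla_model  # noqa: F401
        return True
    except ImportError:
        return False

def require_torch_tpu(test_case):
    # Skip the decorated test unless a TPU backend is importable.
    return unittest.skipUnless(_tpu_available(), "test requires a TPU backend")(test_case)

class ExampleXLATest(unittest.TestCase):
    @require_torch_tpu
    def test_runs_on_tpu(self):
        self.assertTrue(True)
```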
-
- 08 Jul, 2020 1 commit
-
-
Ji Xin authored
* Add deebert code
* Add readme of deebert
* Add test for deebert
  Update test for Deebert
* Update DeeBert (README, class names, function refactoring); remove requirements.txt
* Format update
* Update test
* Update readme and model init methods
-
- 07 Jul, 2020 5 commits
-
-
Patrick von Platen authored
-
Sam Shleifer authored
improve unittests for finetuning, especially w.r.t testing frozen parameters
fix freeze_embeds for T5
add streamlit setup.cfg
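The `freeze_embeds` fix mentioned here concerns freezing embedding weights during fine-tuning. A sketch of the general idea, where the attribute names (`shared`, `encoder.embed_tokens`, `decoder.embed_tokens`) follow a typical seq2seq layout and are assumptions rather than the exact code:

```python
import torch.nn as nn

def freeze_embeds(model: nn.Module) -> None:
    # Turn off gradients for the embedding modules so fine-tuning only
    # updates the rest of the network.
    candidates = [
        getattr(model, "shared", None),
        getattr(getattr(model, "encoder", None), "embed_tokens", None),
        getattr(getattr(model, "decoder", None), "embed_tokens", None),
    ]
    for module in candidates:
        if module is not None:
            for param in module.parameters():
                param.requires_grad = False
```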
-
Patrick von Platen authored
[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395)
* add first version of clm tf
* make style
* add more tests for bert
* update tf clm loss
* fix tests
* correct tf ner script
* add mlm loss
* delete bogus file
* clean tf auto model + add tests
* finish adding clm loss everywhere
* fix training in distilbert
* fix flake8
* save intermediate
* fix tf t5 naming
* remove prints
* finish up
* up
* fix tf gpt2
* fix new test utils import
* fix flake8
* keep backward compatibility
* Update src/transformers/modeling_tf_albert.py
* Update src/transformers/modeling_tf_auto.py
* Update src/transformers/modeling_tf_electra.py
* Update src/transformers/modeling_tf_roberta.py
* Update src/transformers/modeling_tf_mobilebert.py
* Update src/transformers/modeling_tf_auto.py
* Update src/transformers/modeling_tf_bert.py
* Update src/transformers/modeling_tf_distilbert.py
* apply sylvains suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
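The MLM loss added here follows the PyTorch convention of ignoring positions labelled -100. A hedged sketch of that kind of loss in TF, illustrating the technique rather than the library's exact implementation:

```python
import tensorflow as tf

def masked_lm_loss(labels, logits):
    # Per-token sparse cross-entropy that excludes positions labelled -100,
    # mirroring the PyTorch ignore_index convention. Assumes logits has a
    # static vocabulary dimension as its last axis.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    flat_labels = tf.reshape(labels, (-1,))
    flat_logits = tf.reshape(logits, (-1, logits.shape[-1]))
    active = tf.not_equal(flat_labels, -100)
    per_token = loss_fn(tf.boolean_mask(flat_labels, active),
                        tf.boolean_mask(flat_logits, active))
    return tf.reduce_mean(per_token)
```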
-
Suraj Patil authored
* add SquadDataset
* add DataCollatorForQuestionAnswering
* update __init__
* add run_squad with trainer
* add DataCollatorForQuestionAnswering in __init__
* pass data_collator to trainer
* doc tweak
* Update run_squad_trainer.py
* Update __init__.py
* Update __init__.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
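A small sketch of the wiring described above: a data collator passed to `Trainer` so batches are assembled from dataset features. The collator body and argument values are placeholders, not the SQuAD-specific implementation:

```python
import torch
from transformers import Trainer, TrainingArguments

def stack_collator(features):
    # Hypothetical collator: stack each per-example tensor field into a batch.
    return {key: torch.stack([f[key] for f in features]) for key in features[0]}

def build_trainer(model, train_dataset, output_dir="./squad_out"):
    # Placeholder arguments; the real script builds these from SquadDataset
    # and command-line flags.
    args = TrainingArguments(output_dir=output_dir, do_train=True)
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        data_collator=stack_collator,
    )
```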
-
Shashank Gupta authored
* Added data collator for XLNet language modeling and related calls
  Added DataCollatorForXLNetLanguageModeling in data/data_collator.py to generate necessary inputs for language modeling training with XLNetLMHeadModel. Also added related arguments, logic and calls in examples/language-modeling/run_language_modeling.py.
  Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
  Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`. Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use. CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is taken care of similarly to `mems` for XLNet). Changed calls and imports appropriately.
* Added detailed comments, changed variable names
  Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names and made them more informative.
* Added tests for new data collator
  Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
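To give a feel for what a span-masking probability like `--plm_probability` controls, here is a heavily simplified sketch that masks roughly that fraction of positions in short contiguous spans. It only mimics the flavour of the approach and is not the algorithm used by `DataCollatorForPermutationLanguageModeling`:

```python
import torch

def random_span_mask(seq_len: int, plm_probability: float = 1 / 6, max_span: int = 5):
    # Illustrative only: cover roughly `plm_probability` of the positions
    # with short contiguous spans.
    mask = torch.zeros(seq_len, dtype=torch.bool)
    target = int(seq_len * plm_probability)
    cursor = 0
    while int(mask.sum()) < target and cursor < seq_len:
        span = int(torch.randint(1, max_span + 1, (1,)))
        start = int(torch.randint(cursor, seq_len, (1,)))
        mask[start:start + span] = True
        cursor = start + span
    return mask

print(random_span_mask(16))
```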
-
- 06 Jul, 2020 1 commit
-
-
Lysandre Debut authored
-