- 02 Feb, 2021 6 commits
-
Sylvain Gugger authored
-
Stefan Schweter authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Initial work
* Fix doc styler and other models
-
Patrick von Platen authored
-
Patrick von Platen authored
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review
-
- 01 Feb, 2021 8 commits
-
Jan Jitse Venselaar authored
* Change documentation to correctly specify loss tensor size
* Change documentation to correct input format for labels
* Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering
-
Suraj Patil authored
* fix conversion script
* typo
* import nn
-
Suraj Patil authored
-
CeShine Lee authored
This affects Adafactor with `relative_step=False` and `scale_parameter=True`. Updating `group["lr"]` makes the result of `._get_lr()` depend on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.
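A minimal sketch of the pattern being fixed (simplified from `Adafactor._get_lr`; an illustration of the commit message, not the exact source):

```python
import math

# Sketch only: compute the step size into a local variable and return it,
# instead of writing it back into param_group["lr"]. Mutating the shared
# group made each _get_lr() call depend on the previous one.
def _get_lr(param_group, param_state):
    rel_step_sz = param_group["lr"]
    if param_group["relative_step"]:
        min_step = 1e-6 * param_state["step"] if param_group["warmup_init"] else 1e-2
        rel_step_sz = min(min_step, 1.0 / math.sqrt(param_state["step"]))
    param_scale = 1.0
    if param_group["scale_parameter"]:
        param_scale = max(param_group["eps"][1], param_state["RMS"])
    return param_scale * rel_step_sz  # param_group is left untouched
```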
-
Sylvain Gugger authored
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling
-
wlhgtc authored
* MOD: fit Chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Stas Bekman authored
* [t5 doc] fix typos: a few runaway backticks @sgugger
* style
* [trainer] put fp16 args together: a purely cosmetic change that puts all the fp16 args together so they are easier to manage/read @sgugger
* style
* [wandb] make WANDB_DISABLED disable wandb with any value (see the sketch below). This solves part of https://github.com/huggingface/transformers/issues/9623 and does what https://github.com/huggingface/transformers/issues/9699 requested/discussed: any value of `WANDB_DISABLED` should disable wandb. The current behavior is that it has to be one of `ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}`. I have been using `WANDB_DISABLED=true` everywhere in scripts, as it was originally advertised. I have no idea why this was changed to a subset of possible values, and it's not documented anywhere. @sgugger
* WANDB_DISABLED=true to disable; make tf trainer consistent
* style
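A hedged sketch of the resulting behavior (the helper name is hypothetical; only the env-var semantics come from the commit message):

```python
import os

# Hypothetical helper illustrating the new semantics: any non-empty value
# of WANDB_DISABLED turns the integration off, not just "1"/"ON"/"YES".
def wandb_is_disabled() -> bool:
    return bool(os.environ.get("WANDB_DISABLED", ""))

# WANDB_DISABLED=true, WANDB_DISABLED=1, WANDB_DISABLED=yes all disable wandb.
```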
-
Daniel Stancl authored
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and some changes to docs
* Remove test_head_masking flag from test_modeling_fsmt.py, since test_head_masking is set to True by default (thus it is redundant to store)
* Merge master and remove test_head_masking = True
* Rebase necessary due to an update of jaxlib
* Remove test_head_masking=True in tests/test_modeling_fsmt.py as it is redundant
-
- 31 Jan, 2021 2 commits
-
-
Kiyoung Kim authored
* TFBart labels consider both pad token and -100 (see the sketch below)
* make style
* fix for all other models

Co-authored-by: kykim <kykim>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
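A rough sketch of the masking idea in TF (not the exact model code; the function name is illustrative):

```python
import tensorflow as tf

def masked_lm_loss(labels, logits, pad_token_id):
    # Positions labelled with either -100 or the pad token id must both be
    # excluded from the loss, not just one of the two.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    active = tf.logical_and(tf.not_equal(labels, -100), tf.not_equal(labels, pad_token_id))
    return loss_fn(tf.boolean_mask(labels, active), tf.boolean_mask(logits, active))
```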
-
lewtun authored
* Clarify definition of seed argument in Trainer
* Update src/transformers/training_args.py
* Update src/transformers/training_args_tf.py
* Fix style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 29 Jan, 2021 4 commits
-
Stas Bekman authored
-
Sylvain Gugger authored
* When on SageMaker, use their env variables for saves
* Address review comments
* Quality
-
Ethan Chau authored
-
Nicolas Patry authored
* Adding a new `return_full_text` parameter to TextGenerationPipeline. For text generation, the input is sometimes used as prompting text; in that context, prefixing `generated_text` with the actual input forces the caller to take an extra step to remove it. The change adds a new parameter, `return_full_text` (defaulting to the old behavior for backward compatibility), that lets the caller prevent adding the prefix.
* Doc quality.
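Usage looks like this (the model name is just an example):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Default behavior: the prompt is echoed at the start of `generated_text`.
full = generator("Once upon a time")

# New flag: return only the generated continuation, without the prompt.
continuation = generator("Once upon a time", return_full_text=False)
```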
-
- 28 Jan, 2021 6 commits
-
abhishek thakur authored
-
abhishek thakur authored
-
Sylvain Gugger authored
-
Funtowicz Morgan authored
* Fix computation of attention_probs when head_mask is provided (see the sketch below)
* Apply changes to the template

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
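A schematic sketch of the pattern (generic self-attention, not the template's exact code):

```python
import torch

def attend(scores, value, head_mask=None):
    # The head mask must be applied to the probabilities that are actually
    # used for the weighted sum, so masked heads really contribute nothing.
    attention_probs = torch.softmax(scores, dim=-1)
    if head_mask is not None:
        attention_probs = attention_probs * head_mask
    return torch.matmul(attention_probs, value)
```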
-
Lysandre Debut authored
* Allow partial loading of a cached tokenizer
* Warning > Info
* Update src/transformers/tokenization_utils_base.py
* Raise error if not local_files_only

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
abhishek thakur authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
- 27 Jan, 2021 14 commits
-
Stefan Schweter authored
* tests: add integration tests for new Bort model
* bort: add conversion script from GluonNLP to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs
* make fix-copies
* clean doc a bit
* correct docs
* Update docs/source/model_doc/bort.rst
* correct dialogpt doc
* correct link
* Update docs/source/model_doc/dialogpt.rst
* make style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
* fix --lr_scheduler_type choices
* rewrite to fix for all enum-based cl args
* cleanup
* adjust test
* style
* Proposal that should work
* Remove needless code
* Fix test

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
* Allow --arg Value for booleans in HfArgumentParser (see the sketch below)
* Update last test
* Better error message
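Both spellings below should now parse to the same thing (the dataclass is illustrative):

```python
from dataclasses import dataclass, field
from transformers import HfArgumentParser

@dataclass
class Example:
    do_train: bool = field(default=False)

parser = HfArgumentParser(Example)
(a,) = parser.parse_args_into_dataclasses(args=["--do_train"])          # bare flag
(b,) = parser.parse_args_into_dataclasses(args=["--do_train", "True"])  # explicit value
assert a.do_train and b.do_train
```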
-
Sylvain Gugger authored
* When resuming training from checkpoint, Trainer loads model (see the sketch below)
* Finish cleaning tests
* Address review comment
* Use global_step from state
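A hedged usage sketch (the argument name follows later transformers releases and may differ in the version this commit landed in):

```python
from transformers import Trainer, TrainingArguments

# `model` and `train_dataset` stand in for whatever the original run used;
# the point is the checkpoint path handed to train().
args = TrainingArguments(output_dir="output")
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# The Trainer now reloads the model weights found in the checkpoint directory
# rather than assuming the in-memory model already matches it.
trainer.train(resume_from_checkpoint="output/checkpoint-500")  # illustrative path
```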
-
Kiyoung Kim authored
* add tpu_zone and gcp_project in training_args_tf.py (see the sketch below)
* make style

Co-authored-by: kykim <kykim>
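A usage sketch; all values below are placeholders:

```python
from transformers import TFTrainingArguments

# tpu_zone / gcp_project let the TPU cluster resolver locate a TPU that is
# not in the environment's default zone or project.
args = TFTrainingArguments(
    output_dir="out",
    tpu_name="my-tpu",
    tpu_zone="europe-west4-a",
    gcp_project="my-gcp-project",
)
```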
-
Julien Plu authored
-
Sylvain Gugger authored
* Add a flag for find_unused_parameters (see the sketch below)
* Apply suggestions from code review
* Remove negation

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
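In current transformers the flag surfaces as `ddp_find_unused_parameters` on `TrainingArguments`; a sketch:

```python
from transformers import TrainingArguments

# Forwarded to torch.nn.parallel.DistributedDataParallel. Setting it to False
# skips the (costly) scan for parameters that received no gradient, which is
# safe when the model uses all of its parameters every step.
args = TrainingArguments(
    output_dir="out",
    ddp_find_unused_parameters=False,
)
```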
-
Julien Plu authored
* Start cleaning BERT
* Clean BERT and everything that depends on it
* Fix attribute name
* Apply style
* Apply Sylvain's comments
* Apply Lysandre's comments
* remove unused import
-
tomohideshibata authored
Co-authored-by: Tomohide Shibata <tomshiba@yahoo-corp.jp>
-
Julien Plu authored
* Rework documentation
* Update the template
* Trigger CI
* Restore the warning but with the TF logger
* Update convbert doc
-
Nicolas Patry authored
pipeline.
- If the table is empty, the line that contains `answer[0]` will fail.
- This PR adds a check to guard the `answer[0]` access (see the sketch below).
- Also adds an early check for the presence of `table` and `query` to prevent late failure and give a better error message.
- Adds a few tests to make sure these errors are correctly raised.
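A hedged sketch of the kind of guard described (names and messages are illustrative, not the pipeline's exact code):

```python
import pandas as pd

def check_table_query(table: pd.DataFrame, query: str):
    # Fail fast with a clear message instead of crashing later on answer[0].
    if table is None or table.empty:
        raise ValueError("table is empty")
    if not query:
        raise ValueError("query is empty")
```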
-
Patrick von Platen authored
-
jncasey authored
* Fix auto-resume training from checkpoint
* style fixes
-
Patrick von Platen authored
* update jaxlib
* Update setup.py
* update table
-