"utils/vscode:/vscode.git/clone" did not exist on "45f56580a7e11b5b894374f8e1c7bdd54d982682"
- 03 Feb, 2021 1 commit
-
abhishek thakur authored
-
- 02 Feb, 2021 9 commits
-
Daniel Stancl authored
* Add {decoder_,}head_mask to LED
* Fix create_custom_forward signature in encoder
* Add head_mask to Longformer
* Add head_mask to Longformer to fix dependencies of LED on Longformer
* Not working yet
* Add missing input in modeling_longformer.py
* make fix-copies
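A minimal usage sketch of the new arguments (not part of the commit; the checkpoint name and the `(num_layers, num_heads)` mask shape follow the library's usual `head_mask` convention and are assumptions here):

```python
import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
inputs = tokenizer("A long document to summarize.", return_tensors="pt")

# 1.0 keeps a head, 0.0 masks it; here the first head of the first encoder layer is masked
encoder_head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
encoder_head_mask[0, 0] = 0.0
decoder_head_mask = torch.ones(model.config.decoder_layers, model.config.decoder_attention_heads)

outputs = model(
    **inputs,
    decoder_input_ids=inputs["input_ids"][:, :1],  # a single start token, just for illustration
    head_mask=encoder_head_mask,
    decoder_head_mask=decoder_head_mask,
)
```
-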
Patrick von Platen authored
* add raw scaffold
* implement feature extraction layers
* make style
* remove +
* correctly convert weights
* make feature extractor work
* make feature extraction projection work
* run forward pass
* finish forward pass
* successful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tokenizer and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply Sylvain's suggestions
* apply Lysandre's suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply Sylvain's suggestions

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
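For reference, a minimal forward-pass sketch (the checkpoint name and the dummy audio are assumptions, not from the commit) showing `input_values` as the required input name mentioned above:

```python
import torch
from transformers import Wav2Vec2Model, Wav2Vec2Tokenizer

tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")

raw_speech = torch.randn(16000).numpy()  # one second of dummy audio sampled at 16 kHz
inputs = tokenizer(raw_speech, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(inputs.input_values).last_hidden_state
```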
-
Sylvain Gugger authored
-
Stefan Schweter authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Initial work
* Fix doc styler and other models
-
Lysandre Debut authored
* ALBERT Tokenizer integration test
* Batching
* Style
-
Patrick von Platen authored
-
Patrick von Platen authored
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review
-
- 01 Feb, 2021 11 commits
-
Jan Jitse Venselaar authored
* Change documentation to correctly specify loss tensor size
* Change documentation to correct input format for labels
* Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering
-
Suraj Patil authored
* fix conversion script
* typo
* import nn
-
Patrick von Platen authored
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Stefan Schweter <stefan@schweter.it>
  Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* finish adding all suggestions
* make style
* apply Niels' feedback
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply Sylvain's suggestions

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Suraj Patil authored
-
CeShine Lee authored
This affects Adafactor with `relative_step=False` and `scale_parameter=True`. Updating `group["lr"]` makes the result of `._get_lr()` depend on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.
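For context, a minimal sketch of the configuration this fix applies to (the toy model and values are illustrative assumptions, not from the commit):

```python
import torch
from transformers.optimization import Adafactor

model = torch.nn.Linear(4, 2)  # placeholder model
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,                 # external learning rate, required when relative_step=False
    relative_step=False,
    scale_parameter=True,    # scale the learning rate by each parameter's RMS
    warmup_init=False,
)
# Before the fix, each step mutated group["lr"], so _get_lr() for one parameter
# could be influenced by the scale of the previously processed parameter.
```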
-
Sylvain Gugger authored
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling
-
wlhgtc authored
* MOD: fit Chinese whole-word masking (wwm) to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Stas Bekman authored
* [t5 doc] fix typos: a few runaway backticks @sgugger
* style
* [trainer] put fp16 args together
  This PR proposes a purely cosmetic change that groups all the fp16 args together so they are easier to manage/read. @sgugger
* style
* [wandb] make WANDB_DISABLED disable wandb with any value
  This PR solves part of https://github.com/huggingface/transformers/issues/9623 It tries to do what https://github.com/huggingface/transformers/issues/9699 requested/discussed: any value of `WANDB_DISABLED` should disable wandb. The current behavior is that it has to be one of `ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}`. I have been using `WANDB_DISABLED=true` everywhere in scripts, as it was originally advertised; it is unclear why this was changed to a subset of possible values, and it's not documented anywhere. @sgugger
* WANDB_DISABLED=true to disable; make the TF trainer consistent
* style
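A rough sketch of the behavior change (not the library's actual code), assuming the `ENV_VARS_TRUE_VALUES` check described above:

```python
import os

ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}

def wandb_disabled_before() -> bool:
    # old behavior: only a "truthy" value from this set disabled the integration
    return os.getenv("WANDB_DISABLED", "").upper() in ENV_VARS_TRUE_VALUES

def wandb_disabled_after() -> bool:
    # new behavior: any non-empty value (e.g. WANDB_DISABLED=true) disables wandb
    return bool(os.getenv("WANDB_DISABLED", ""))
```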
-
Stas Bekman authored
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and some changes to docs
* Remove test_head_masking flag from the FSMT test file
  Remove the test_head_masking flag from test_modeling_fsmt.py since test_head_masking is set to True by default (thus it is redundant to store).
* Merge master and remove test_head_masking = True
* Rebase necessary due to an update of jaxlib
* Remove test_head_masking=True in tests/test_modeling_fsmt.py as it is redundant.
-
- 31 Jan, 2021 2 commits
-
Kiyoung Kim authored
* TFBart labels consider both pad token and -100
* make style
* fix for all other models

Co-authored-by: kykim <kykim>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
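A simplified sketch of the idea (an illustrative helper, not the library's exact implementation): label positions equal to either `-100` or the pad token id are excluded from the loss.

```python
import tensorflow as tf

def masked_lm_loss(labels, logits, pad_token_id):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    # keep only positions whose label is neither -100 nor the pad token
    active = tf.logical_and(tf.not_equal(labels, -100), tf.not_equal(labels, pad_token_id))
    labels_kept = tf.boolean_mask(labels, active)
    logits_kept = tf.boolean_mask(logits, active)
    return tf.reduce_mean(loss_fn(labels_kept, logits_kept))

# tiny usage example with made-up values
labels = tf.constant([[5, 7, 0, -100]])
logits = tf.random.uniform((1, 4, 32))
loss = masked_lm_loss(labels, logits, pad_token_id=0)
```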
-
lewtun authored
* Clarify definition of seed argument in Trainer
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args_tf.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix style
* Update src/transformers/training_args.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 30 Jan, 2021 1 commit
-
Stas Bekman authored
Apparently nested inline markup in RST is invalid: https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible

So currently this line doesn't get rendered properly, leaving the inner markup unrendered, resulting in:

```
https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible
```

This PR removes the bold, which fixes the link.
-
- 29 Jan, 2021 6 commits
-
Stas Bekman authored
-
Stas Bekman authored
-
Sylvain Gugger authored
* When on SageMaker, use their env variables for saves
* Address review comments
* Quality
-
Julien Plu authored
-
Ethan Chau authored
-
Nicolas Patry authored
* Adding a new `return_full_text` parameter to TextGenerationPipeline.
  For text generation, the input is sometimes used as prompting text. In that context, prefixing `generated_text` with the actual input forces the caller to take an extra step to remove it. The proposed change adds a new parameter (kept backward compatible), `return_full_text`, that lets the caller prevent the prefix from being added.
* Doc quality.
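A short usage sketch (the model name is an assumption): with `return_full_text=False` the prompt is no longer prepended to `generated_text`.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("The meaning of life is", max_length=20, return_full_text=False)
print(out[0]["generated_text"])  # continuation only, without the prompt
```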
-
- 28 Jan, 2021 10 commits
-
abhishek thakur authored
-
abhishek thakur authored
-
Stas Bekman authored
* expand install instructions
* fix
* white space
* rewrite as discussed in the PR
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change the wording to encourage issue reports

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Daniel Stancl authored
* Remove redundant test_head_masking = True flags
* Remove all redundant test_head_masking flags in PyTorch test_modeling_* files
* Make test_head_masking = True the default choice in test_modeling_tf_common.py
* Remove all redundant test_head_masking flags in TensorFlow test_modeling_tf_* files
* Put back test_head_masking=False for TFT5 models
-
Joe Davison authored
-
Sylvain Gugger authored
-
Funtowicz Morgan authored
* Fix computation of attention_probs when head_mask is provided.
  Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply changes to the template

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
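For context, a simplified sketch of standard BERT-style attention (an assumption about the general pattern, not the template's exact code), showing where `head_mask` enters the computation of `attention_probs`:

```python
import math
import torch
import torch.nn.functional as F

def attention(query, key, value, head_mask=None, dropout_p=0.1):
    # query/key/value: (batch, num_heads, seq_len, head_dim)
    scores = torch.matmul(query, key.transpose(-1, -2)) / math.sqrt(query.size(-1))
    attention_probs = F.softmax(scores, dim=-1)
    attention_probs = F.dropout(attention_probs, p=dropout_p)
    if head_mask is not None:
        # the mask (one value per head, 1 = keep, 0 = prune) is applied to the
        # probabilities, after softmax and dropout
        attention_probs = attention_probs * head_mask.view(1, -1, 1, 1)
    return torch.matmul(attention_probs, value)
```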
-
Nicolas Patry authored
-
Lysandre Debut authored
-
Lysandre Debut authored
* Allow partial loading of a cached tokenizer
* Warning > Info
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Raise an error if not local_files_only

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
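A short usage sketch (the model name is an assumption): per the bullets above, in `local_files_only` mode a tokenizer can now be loaded from whatever files are already in the local cache, while missing files outside that mode raise an error.

```python
from transformers import AutoTokenizer

# assumes the checkpoint has been downloaded at least once; loads from the cache
# without any network calls, even if some optional tokenizer files are absent
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", local_files_only=True)
```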
-