Commits · 71688a8889c4df7dd6d90a65d895ccf4e33a1a56 · chenpangpang / transformers

04 Dec, 2020 3 commits

Fix TF T5 only encoder model with booleans (#8925) · 71688a88
Lysandre Debut authored Dec 04, 2020

71688a88

Better booleans handling in the TF models (#8777) · dcd3046f

Julien Plu authored Dec 04, 2020

* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input processing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add boolean processing for the inputs

* Apply style

* Missing optional

* Fix some missing input processing

* Update the template

* Fix missing inputs

* Missing input

* Fix args parameter

* Trigger CI

* Trigger CI

* Trigger CI

* Address Patrick's and Sylvain's comments

* Replace warn by warning

* Trigger CI

* Fix XLNET

* Fix detection

dcd3046f

[s2s finetune_trainer] add instructions for distributed training (#8884) · 4c3d98dd
Stas Bekman authored Dec 03, 2020

4c3d98dd

03 Dec, 2020 7 commits

Patch model parallel test (#8920) · aa60b230
Lysandre Debut authored Dec 03, 2020

* Patch model parallel test

* Remove line

* Remove `ci_*` from scheduled branches

aa60b230

Put Transformers on Conda (#8918) · 0c5615af

Lysandre Debut authored Dec 03, 2020



* conda

* Guide

* correct tag

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/installation.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0c5615af

Tweak wording + Add badge w/ number of models on the hub (#8914) · 9ad61943

Julien Chaumond authored Dec 03, 2020

* Add badge w/ number of models on the hub

* try to appease @sgugger 😇



* not sure what this `c` was about [ci skip]

* Fix script and move stuff around

* Fix doc styling error
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

9ad61943

Fix move when the two cache folders exist (#8917) · 6ed7e32f
Sylvain Gugger authored Dec 03, 2020

6ed7e32f
Avoid erasing the attention mask when double padding (#8915) · 8453201c
Sylvain Gugger authored Dec 03, 2020

8453201c
Don't warn that models aren't available if Flax is available. (#8841) · 0deece9c
Skye Wanderman-Milne authored Dec 03, 2020

0deece9c
[model_cards] lm-head was deprecated · 2b7fc9a0
Julien Chaumond authored Dec 03, 2020

(and wasn't needed here anyways as it was added automatically)

2b7fc9a0

02 Dec, 2020 7 commits

[PyTorch] Refactor Resize Token Embeddings (#8880) · 443f67e8

Patrick von Platen authored Dec 02, 2020

* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add new tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py

443f67e8

Update README.md (#8906) · e52f9c0a
Devangi Purkayastha authored Dec 02, 2020

e52f9c0a
Fix typo in docstring (#8905) · 801b2cb3
ryota-mo authored Dec 03, 2020

801b2cb3

[trainer] improve code readability (#8903) · 7e1cb00c

Stas Bekman authored Dec 02, 2020

* [trainer] improve code

This PR:
- removes redundant code 
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are the same.

* separate attribute assignment from code logic - which simplifies things further.

* whitespace
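
As a small illustration of the redundancy described in the first bullet, here is a minimal sketch. `assign_verbose` and `assign_direct` are hypothetical names used only for this example; they are not part of the trainer.

```python
def assign_verbose(model):
    # The conditional form quoted above: yields model, or None when model is None.
    return model if model is not None else None


def assign_direct(model):
    # The simplified form: the value is passed through unchanged.
    return model


# Both forms return the very same object for None and for non-None inputs.
candidates = [None, object(), 0, "", []]
results = [assign_verbose(c) is assign_direct(c) for c in candidates]
```

Every entry of `results` is `True`, which is why the conditional adds nothing.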

7e1cb00c

Warning about too long input for fast tokenizers too (#8799) · a8c3f9aa

Nicolas Patry authored Dec 02, 2020

* Warning about too long input for fast tokenizers too

If truncation is not set in tokenizers, but the tokenization is too long
for the model (`model_max_length`), we used to trigger a warning that
the input would probably fail (which it most likely will).

This PR re-enables the warning for fast tokenizers too and uses common
code for the trigger to make sure it's consistent across both.

* Checking for pair of inputs too.

* Making the function private and adding its doc.

* Remove formatting ?? in odd place.

* Missed uppercase.
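
The check described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `warn_if_too_long` and its parameters are hypothetical and do not reproduce the library's private helper; only `model_max_length` and the notion of truncation come from the message above.

```python
import warnings


def warn_if_too_long(ids, pair_ids=None, model_max_length=None, truncation=False):
    """Hypothetical shared check; returns True if a warning was emitted."""
    if truncation or model_max_length is None:
        # Truncation will shorten the input, or no limit is known.
        return False
    # Count the second sequence too, so that pairs of inputs are covered.
    total = len(ids) + (len(pair_ids) if pair_ids is not None else 0)
    if total > model_max_length:
        warnings.warn(
            f"Token sequence length ({total}) exceeds model_max_length "
            f"({model_max_length}); running it through the model will probably fail."
        )
        return True
    return False
```

Keeping the trigger in one function is what lets two tokenizer implementations behave the same way: each only needs to call it with its own token ids.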

a8c3f9aa

Transfoxl seq classification (#8868) · f6b44e61
sandip authored Dec 02, 2020

* Transfoxl sequence classification

* Transfoxl sequence classification

f6b44e61

[ci] skip doc jobs take #3 (#8885) · 24f0c2fe

Stas Bekman authored Dec 02, 2020

* check that we get any match first

* docs only

* 2 docs only

* add code

* restore

24f0c2fe

01 Dec, 2020 11 commits

disable job skip - need more work · 693ac359

Stas Bekman authored Dec 01, 2020

reference: https://github.com/huggingface/transformers/pull/8853#issuecomment-736779863

693ac359

start using training_args.parallel_mode (#8882) · 379005c9
Stas Bekman authored Dec 01, 2020

379005c9
Add a `parallel_mode` property to TrainingArguments (#8877) · b08843cf
Sylvain Gugger authored Dec 01, 2020

* Add a `distributed_env` property to TrainingArguments

* Change name

* Address comment

b08843cf
Better support for resuming training (#8878) · 7c10dd22
Sylvain Gugger authored Dec 01, 2020

7c10dd22

[CI] skip docs-only jobs take #2 (#8853) · 21db560d

Stas Bekman authored Dec 01, 2020

* restore skip

* Revert "Remove deprecated `evaluate_during_training` (#8852)"

This reverts commit 55302990.

* check that pipeline.git.base_revision is defined before proceeding

* Revert "Revert "Remove deprecated `evaluate_during_training` (#8852)""

This reverts commit dfec84db3fdce1079f01f1bc8dfaf21db2ccaba1.

* check that pipeline.git.base_revision is defined before proceeding

* doc only

* doc + code

* restore

* restore

* typo

21db560d

Better warning when loading a tokenizer with AutoTokenizer w/o SentencePiece (#8881) · a947386c
Lysandre Debut authored Dec 01, 2020

a947386c

Prevent BatchEncoding from blindly passing casts down to the tensors it... · 9c18f156

Adam Pocock authored Dec 01, 2020

Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)

Update src/transformers/tokenization_utils_base.py with review fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

9c18f156

Make the big table creation/check platform independent (#8856) · c0df963e
Sylvain Gugger authored Dec 01, 2020

c0df963e

2 typos in modeling_rag.py (#8676) · d366228d

Ratthachat (Jung) authored Dec 01, 2020

* 2 typos - from_question_encoder_generator_configs

fix 2 typos
from_encoder_generator_configs --> from_question_encoder_generator_configs

* apply make style

d366228d

Fix doc for language code (#8848) · 814b9550
Rodolfo Quispe authored Dec 01, 2020

814b9550

Ctrl for sequence classification (#8812) · 4a9e502a

elk-cloner authored Dec 01, 2020

* add CTRLForSequenceClassification

* pass local test

* merge with master

* fix modeling test for sequence classification

* fix deco

* fix assert

4a9e502a

30 Nov, 2020 12 commits

[s2s trainer] fix DP mode (#8823) · 7f34d757

Stas Bekman authored Nov 30, 2020

* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup

7f34d757

NerPipeline (TokenClassification) now outputs offsets of words (#8781) · d8fc26e9

Nicolas Patry authored Nov 30, 2020

* NerPipeline (TokenClassification) now outputs offsets of words

- The offsets are currently missing, which forces users to pattern-match
the "word" against their input, and that is not always feasible.
For instance, if a sentence contains the same word twice, there
is no way to know which is which.
- This PR proposes to fix that by outputting 2 new keys in this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means that the following invariant should
always hold:

```python
input[entity["start"]: entity["end"]] == entity["entity_group"]
                                    # or entity["entity"] if not grouped
```

* Fixing doc style
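
To make the repeated-word case from the first bullet concrete, here is a minimal sketch that does not call the pipeline at all. The `entities` list is hand-written; the "start" and "end" keys come from the commit message, while the "word" key and the sample sentence are assumptions of this example.

```python
text = "Paris is lovely, and Paris is crowded."

# Hand-written stand-ins for two pipeline results covering the same word.
entities = [
    {"word": "Paris", "start": 0, "end": 5},
    {"word": "Paris", "start": 21, "end": 26},
]

# Slicing the input with the offsets recovers each occurrence unambiguously.
spans = [text[e["start"]:e["end"]] for e in entities]

# A plain string search, by contrast, only ever reports the first occurrence.
first_match = text.find("Paris")
```

Both slices yield the same word, but the distinct offsets tell the two mentions apart, which searching for the word alone cannot do.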

d8fc26e9

fix pypi complaint on version naming · 5fd3d81e
LysandreJik authored Nov 30, 2020

5fd3d81e

Attempt to fix Flax CI error(s) (#8829) · 51b07131

Funtowicz Morgan authored Nov 30, 2020



* Slightly increase tolerance between pytorch and flax output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* test_multiple_sentences doesn't require torch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Simplify parameterization on "jit" to use boolean rather than str
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use `require_torch` on `test_multiple_sentences` because we pull the weight from the hub.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove pytest.mark.parametrize which seems to fail in some circumstances
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Give default parameters values for traced model.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Review comment: Change sentences to sequences
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

51b07131

Update docs · 9995a341
LysandreJik authored Nov 30, 2020

9995a341
Release: v4.0.0 · 22b0ff75
LysandreJik authored Nov 30, 2020

22b0ff75

Remove deprecated `evaluate_during_training` (#8852) · 55302990

Sylvain Gugger authored Nov 30, 2020



* Remove deprecated `evaluate_during_training`

* Update src/transformers/training_args_tf.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

55302990

Use model.from_pretrained for DataParallel also (#8795) · 77384941

Shai Erera authored Nov 30, 2020

* Use model.from_pretrained for DataParallel also

When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However if the model has custom from_pretrained logic, it does not get applied during load_best_model_at_end.

This commit uses the underlying model during load_best_model_at_end, and re-wraps the loaded model with DataParallel.

If you choose to reject this change, could you please move this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint) or something, so that it can be overridden?

* Fix silly bug

* Address review comments

Thanks for the feedback. I made the change that you proposed, but I also think we should update L811 to check if `self.model` is an instance of `PreTrained`, otherwise we would still not get into that `if` section, right?
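
The unwrap, reload, re-wrap idea from the first bullet can be sketched without PyTorch. Everything below is a stand-in written for illustration: `TinyModel`, `FakeDataParallel` and `reload_best` are hypothetical names, and the only property borrowed from `torch.nn.DataParallel` is that the wrapped model is reachable through `.module`.

```python
class TinyModel:
    """Hypothetical model with custom loading logic in from_pretrained."""

    def __init__(self, source="scratch"):
        self.source = source

    @classmethod
    def from_pretrained(cls, path):
        # Custom logic that must not be bypassed when reloading a checkpoint.
        return cls(source=f"custom-load:{path}")


class FakeDataParallel:
    """Stand-in for torch.nn.DataParallel: keeps the wrapped model in .module."""

    def __init__(self, module):
        self.module = module


def reload_best(model, checkpoint_path):
    # Unwrap if needed so the underlying class's from_pretrained is used.
    wrapped = isinstance(model, FakeDataParallel)
    inner = model.module if wrapped else model
    reloaded = type(inner).from_pretrained(checkpoint_path)
    # Re-wrap so the caller keeps the same multi-GPU setup.
    return FakeDataParallel(reloaded) if wrapped else reloaded


best = reload_best(FakeDataParallel(TinyModel()), "best-checkpoint")
```

Putting this in its own function also matches the request in the message to keep the logic overridable.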

77384941

Merge remote-tracking branch 'origin/master' · 4062c75e
Sylvain Gugger authored Nov 30, 2020

4062c75e
Comment the skip job on doc line · 08e70763
Sylvain Gugger authored Nov 30, 2020

08e70763
Add a direct link to the big table (#8850) · 75f8100f
Sylvain Gugger authored Nov 30, 2020

75f8100f
Correct docstring. (#8845) · cc983cd9
Fraser Greenlee authored Nov 30, 2020

Related issue: https://github.com/huggingface/transformers/issues/8837

cc983cd9