Commits · c108d0b5a43fee12e1ef578fe871f0f123b06018 · chenpangpang / transformers

07 Dec, 2020 3 commits

transformers-cli: LFS multipart uploads (> 5GB) (#8663) · 28fa014a

Julien Chaumond authored Dec 07, 2020



* initial commit

* [cli] lfs commands

* Fix FileSlice

* Tweak to FileSlice

* [hf_api] Backport filetype arg from `datasets`

cc @lhoestq

* Slim down the CI while I'm working

* Ok let's try this in CI

* Update config.yml

* Do not try this at home

* one more try

* Update lfs.py

* Revert "Tweak to FileSlice"

This reverts commit d7e32c4b3500400486411e85a2b74e57fb6b52f5.

* Update test_hf_api.py

* Update test_hf_api.py

* Update test_hf_api.py

* CI still green?

* make CI green again?

* Update test_hf_api.py

* make CI red again?

* Update test_hf_api.py

* add CI style back

* Fix CI?

* oh my

* doc + switch back to real staging endpoint

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* Fix docblock + f-strings
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

28fa014a

Add TFGPT2ForSequenceClassification based on DialogRPT (#8714) · 483e1327

sandip authored Dec 07, 2020

* Add TFGPT2ForSequenceClassification based on DialogRPT

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* code refactor for latest other TF PR

* code refactor

* code refactor

* Update modeling_tf_gpt2.py

483e1327

Fix QA pipeline on Windows (#8947) · 28c77ddf
Sylvain Gugger authored Dec 07, 2020

28c77ddf

05 Dec, 2020 1 commit
- Fix typo for `modeling_bert` import resulting in ImportError (#8931) · ef93a254
  Machel Reid authored Dec 05, 2020
```
Self-explanatory ;) - Hope it helps!
```
  ef93a254
04 Dec, 2020 2 commits

Fix TF T5 only encoder model with booleans (#8925) · 71688a88
Lysandre Debut authored Dec 04, 2020

71688a88

Better booleans handling in the TF models (#8777) · dcd3046f

Julien Plu authored Dec 04, 2020

* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input processing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add boolean processing for the inputs

* Apply style

* Missing optional

* Fix missing some input proc

* Update the template

* Fix missing inputs

* Missing input

* Fix args parameter

* Trigger CI

* Trigger CI

* Trigger CI

* Address Patrick's and Sylvain's comments

* Replace warn by warning

* Trigger CI

* Fix XLNET

* Fix detection

dcd3046f

03 Dec, 2020 3 commits
- Fix move when the two cache folders exist (#8917) · 6ed7e32f
  Sylvain Gugger authored Dec 03, 2020
  
  6ed7e32f
- Avoid erasing the attention mask when double padding (#8915) · 8453201c
  Sylvain Gugger authored Dec 03, 2020
  
  8453201c
- Don't warn that models aren't available if Flax is available. (#8841) · 0deece9c
  Skye Wanderman-Milne authored Dec 03, 2020
  
  0deece9c
02 Dec, 2020 5 commits

[PyTorch] Refactor Resize Token Embeddings (#8880) · 443f67e8

Patrick von Platen authored Dec 02, 2020

* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add new tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py

443f67e8

Fix typo in docstring (#8905) · 801b2cb3
ryota-mo authored Dec 03, 2020

801b2cb3

[trainer] improve code readability (#8903) · 7e1cb00c

Stas Bekman authored Dec 02, 2020

* [trainer] improve code

This PR:
- removes redundant code 
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are the same.

* separate attribute assignment from code logic - which simplifies things further.

* whitespace

7e1cb00c

Warning about too long input for fast tokenizers too (#8799) · a8c3f9aa

Nicolas Patry authored Dec 02, 2020

* Warning about too long input for fast tokenizers too

If truncation is not set in tokenizers, but the tokenization is too long
for the model (`model_max_length`), we used to trigger a warning that
the input would probably fail (which it most likely will).

This PR re-enables the warning for fast tokenizers too and uses common
code for the trigger to make sure it's consistent across both.

* Checking for pair of inputs too.

* Making the function private and adding its doc.

* Remove formatting ?? in odd place.

* Missed uppercase.
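
The length check described in this commit can be sketched roughly as follows. This is a minimal illustration and not the library's code: the function name `warn_if_too_long`, its parameters, and the message wording are all assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)


def warn_if_too_long(ids, model_max_length, truncation_enabled):
    """Hypothetical helper: warn when an untruncated sequence is too long.

    Returns True if a warning was emitted, False otherwise.
    """
    # Nothing to report if truncation is on or the model has no length limit.
    if truncation_enabled or model_max_length is None:
        return False
    if len(ids) > model_max_length:
        logger.warning(
            "Sequence length %d exceeds model_max_length %d; "
            "the model will probably fail on this input.",
            len(ids),
            model_max_length,
        )
        return True
    return False


# Shared by both code paths, so slow and fast tokenizers behave the same.
emitted = warn_if_too_long(list(range(600)), 512, truncation_enabled=False)
```

Keeping the check in one shared function is what makes the behavior consistent between tokenizer implementations.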

a8c3f9aa

Transfoxl seq classification (#8868) · f6b44e61
sandip authored Dec 02, 2020
```
* Transfoxl sequence classification

* Transfoxl sequence classification
```
f6b44e61

01 Dec, 2020 8 commits
- Add a `parallel_mode` property to TrainingArguments (#8877) · b08843cf
  Sylvain Gugger authored Dec 01, 2020
```
* Add a `distributed_env` property to TrainingArguments

* Change name

* Address comment
```
  b08843cf
- Better support for resuming training (#8878) · 7c10dd22
  Sylvain Gugger authored Dec 01, 2020
  
  7c10dd22
- Better warning when loading a tokenizer with AutoTokenizer w/o SentencePiece (#8881) · a947386c
  Lysandre Debut authored Dec 01, 2020
  
  a947386c
- Prevent BatchEncoding from blindly passing casts down to the tensors it... · 9c18f156
  Adam Pocock authored Dec 01, 2020
```
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)

Update src/transformers/tokenization_utils_base.py with review fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
  9c18f156
- Make the big table creation/check platform independent (#8856) · c0df963e
  Sylvain Gugger authored Dec 01, 2020
  
  c0df963e
- 2 typos in modeling_rag.py (#8676) · d366228d
  Ratthachat (Jung) authored Dec 01, 2020
```
* 2 typos - from_question_encoder_generator_configs

fix 2 typos
from_encoder_generator_configs --> from_question_encoder_generator_configs

* apply make style
```
  d366228d
- Fix doc for language code (#8848) · 814b9550
  Rodolfo Quispe authored Dec 01, 2020
  
  814b9550
- Ctrl for sequence classification (#8812) · 4a9e502a
  elk-cloner authored Dec 01, 2020
```
* add CTRLForSequenceClassification

* pass local test

* merge with master

* fix modeling test for sequence classification

* fix deco

* fix assert
```
  4a9e502a
30 Nov, 2020 7 commits

NerPipeline (TokenClassification) now outputs offsets of words (#8781) · d8fc26e9

Nicolas Patry authored Nov 30, 2020

* NerPipeline (TokenClassification) now outputs offsets of words

- Currently the offsets are missing, which forces users to pattern
match the "word" against their input, which is not always feasible.
For instance, if a sentence contains the same word twice, then there
is no way to know which is which.
- This PR proposes to fix that by outputting 2 new keys in this
pipeline's outputs, "start" and "end", which correspond to the string
offsets of the word. That means that we should always have the
invariant:

```python
input[entity["start"]: entity["end"]] == entity["entity_group"]
                                    # or entity["entity"] if not grouped
```

* Fixing doc style
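
To illustrate why string offsets help with repeated words, here is a minimal sketch that uses hand-written entity dictionaries instead of real pipeline output. The `text`, the `entities` list with its values, and the `surface_text` helper are hypothetical; only the "start" and "end" keys come from the commit description above.

```python
# Hypothetical, hand-written entities standing in for pipeline output.
# Only the "start"/"end" keys are taken from the commit description.
text = "Paris met Paris Hilton in Paris."

entities = [
    {"entity_group": "LOC", "start": 0, "end": 5},
    {"entity_group": "PER", "start": 10, "end": 22},
    {"entity_group": "LOC", "start": 26, "end": 31},
]


def surface_text(text, entity):
    """Return the substring of `text` that the entity's offsets point to."""
    return text[entity["start"]:entity["end"]]


# The offsets identify each occurrence without any pattern matching,
# even though "Paris" appears three times in the input.
spans = [surface_text(text, e) for e in entities]
```

With offsets, each occurrence of a repeated word maps to a distinct span of the input.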

d8fc26e9

fix pypi complaint on version naming · 5fd3d81e
LysandreJik authored Nov 30, 2020

5fd3d81e
Release: v4.0.0 · 22b0ff75
LysandreJik authored Nov 30, 2020

22b0ff75

Remove deprecated `evaluate_during_training` (#8852) · 55302990

Sylvain Gugger authored Nov 30, 2020



* Remove deprecated `evaluate_during_training`

* Update src/transformers/training_args_tf.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

55302990

Use model.from_pretrained for DataParallel also (#8795) · 77384941

Shai Erera authored Nov 30, 2020

* Use model.from_pretrained for DataParallel also

When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However if the model has custom from_pretrained logic, it does not get applied during load_best_model_at_end.

This commit uses the underlying model during load_best_model_at_end, and re-wraps the loaded model with DataParallel.

If you choose to reject this change, could you please move this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint) or something, so that it can be overridden?

* Fix silly bug

* Address review comments

Thanks for the feedback. I made the change that you proposed, but I also think we should update L811 to check if `self.model` is an instance of `PreTrained`, otherwise we would still not get into that `if` section, right?
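
The unwrap-and-re-wrap idea from this commit can be sketched without any deep learning framework. Everything below is an assumption made for illustration: `ParallelWrapper` is a stand-in that mimics how `torch.nn.DataParallel` stores the wrapped model in `.module`, and `CustomModel` and `load_best_model` are hypothetical names, not the Trainer's actual code.

```python
class ParallelWrapper:
    """Stand-in for a DataParallel-style wrapper (assumption for this sketch).

    Like torch.nn.DataParallel, it keeps the wrapped model in `.module`.
    """

    def __init__(self, module):
        self.module = module


class CustomModel:
    """Hypothetical model whose from_pretrained has custom logic."""

    def __init__(self, checkpoint, custom_loaded=False):
        self.checkpoint = checkpoint
        self.custom_loaded = custom_loaded

    @classmethod
    def from_pretrained(cls, checkpoint):
        # Custom loading logic that must not be bypassed.
        return cls(checkpoint, custom_loaded=True)


def load_best_model(model, best_checkpoint):
    """Reload via the underlying model's class, then restore the wrapper."""
    wrapped = isinstance(model, ParallelWrapper)
    inner = model.module if wrapped else model
    reloaded = type(inner).from_pretrained(best_checkpoint)
    return ParallelWrapper(reloaded) if wrapped else reloaded


model = ParallelWrapper(CustomModel("initial"))
best = load_best_model(model, "checkpoint-500")
```

Going through `type(inner).from_pretrained` is what lets a subclass's custom loading run even when the model handed in was wrapped.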

77384941

Correct docstring. (#8845) · cc983cd9
Fraser Greenlee authored Nov 30, 2020
```
Related issue: https://github.com/huggingface/transformers/issues/8837
```
cc983cd9

Add T5 Encoder for Feature Extraction (#8717) · 40ecaf0c

Ahmed Elnaggar authored Nov 30, 2020



* Add T5 Encoder class for feature extraction

* fix T5 encoder add_start_docstrings indent

* update init with T5 encoder

* update init with TFT5ModelEncoder

* remove TFT5ModelEncoder

* change T5ModelEncoder order in init

* add T5ModelEncoder to transformers init

* clean T5ModelEncoder

* update init with TFT5ModelEncoder

* add TFModelEncoder for Tensorflow

* update init with TFT5ModelEncoder

* Update src/transformers/models/t5/modeling_t5.py

change output from Seq2SeqModelOutput to BaseModelOutput
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove encoder_outputs

1. remove encoder_outputs from the function call.
2. remove the encoder_outputs If statement.
3. remove isinstance from return_dict.

* Authorize missing decoder keys

* remove unnecessary input parameters

remove past_key_values and use_cache

* remove use_cache

remove use_cache from the forward method

* add docstring for T5 encoder

add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING

* change return_dict to dot access

* add T5_ENCODER_INPUTS_DOCSTRING for TF T5

* change TFT5Encoder output type to BaseModelOutput

* remove unnecessary parameters for TFT5Encoder

* remove unnecessary if statement

* add import BaseModelOutput

* fix BaseModelOutput typo to TFBaseModelOutput

* update T5 doc with T5ModelEncoder

* add T5ModelEncoder to tests

* finish pytorch

* finish docs and mt5

* add mtf to init

* fix init

* remove n_positions

* finish PR

* Update src/transformers/models/mt5/modeling_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_tf_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/mt5/modeling_tf_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

40ecaf0c

29 Nov, 2020 2 commits

Minor docs typo fixes (#8797) · 3a08cc1c

Guy Rosin authored Nov 29, 2020



* Fix minor typos

* Additional typos

* Style fix
Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>

3a08cc1c

[Pegasus] Refactor Tokenizer (#8731) · 5ced23dc

Patrick von Platen authored Nov 29, 2020

* refactor

* further refactor

* fix the rest tomorrow

* save intermediate

* finish slow tokenizer

* make more tests pass

* finish refactor

* fix comment

* clean further

* fix name

* fix naming

* Update src/transformers/models/reformer/tokenization_reformer.py

* Apply suggestions from code review

* Apply suggestions from code review

* refactor

* fix init tokenizers

* refactor

* improve convert

* refactor

* correct convert slow tokenizer

* final fix for Pegasus Tok

* remove ipdb

* improve links

5ced23dc

28 Nov, 2020 1 commit
- fix mt5 config (#8832) · 36b60ce9
  Patrick von Platen authored Nov 28, 2020
  
  36b60ce9
27 Nov, 2020 5 commits

BART & FSMT: fix decoder not returning hidden states from the last layer (#8597) · 0a921b64

Max Del authored Nov 27, 2020



* Fix decoder not returning hidden states from the last layer

* Resolve conflict

* Change the way to gather hidden states

* Add decoder hidden states test

* Make pytest and black happy

* Remove redundant line

* remove new line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

0a921b64

Add barthez model (#8393) · 81fe0bf0

Moussa Kamal Eddine authored Nov 27, 2020



* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained French seq2seq model that uses the BART objective.

* Apply suggestions from code review docs typos
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

81fe0bf0

Extend typing to path-like objects in `PretrainedConfig` and `PreTrainedModel` (#8770) · f9a2a9e3

Giovanni Compagnoni authored Nov 27, 2020

* update configuration_utils.py typing to allow pathlike objects when sensible

* update modeling_utils.py typing to allow pathlike objects when sensible

* black

* update tokenization_utils_base.py typing to allow pathlike objects when sensible

* update tokenization_utils_fast.py typing to allow pathlike objects when sensible

* update configuration_auto.py typing to allow pathlike objects when sensible

* update configuration_auto.py docstring to allow pathlike objects when sensible

* update tokenization_auto.py docstring to allow pathlike objects when sensible

* black
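
As a rough sketch of what accepting path-like objects means for a function signature, the example below widens a `str` annotation to `Union[str, os.PathLike]` and normalizes the argument with `os.fspath`. The helper `resolve_config_file` and the `config.json` file name are illustrative assumptions, not code taken from the commit.

```python
import os
from pathlib import Path
from typing import Union


def resolve_config_file(pretrained_path: Union[str, os.PathLike]) -> str:
    """Hypothetical helper: accept str or path-like, return a str path."""
    # os.fspath returns a str unchanged and converts os.PathLike objects
    # (such as pathlib.Path) to their string representation.
    directory = os.fspath(pretrained_path)
    return os.path.join(directory, "config.json")


# Both call styles produce the same result.
from_str = resolve_config_file("my_model")
from_path = resolve_config_file(Path("my_model"))
```

Callers can then pass `pathlib.Path` objects directly instead of converting them with `str()` first.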

f9a2a9e3

Fix dpr<>bart config for RAG (#8808) · a7d46a06

Patrick von Platen authored Nov 27, 2020

* correct dpr test and bert pos fault

* fix dpr bert config problem

* fix layoutlm

* add config to dpr as well

a7d46a06

[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791) · f8eda599

Kristian Holsheimer authored Nov 27, 2020

* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes

* [FlaxRoberta] Fix non-broadcastable attention mask

* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)

* Partially revert "Use jax.numpy ..."

* Add tests for batched forward passes

* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA

* Auto-fix style

* Re-enable GPU memory preallocation but with mem fraction < 1/parallelism
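
One way to express "preallocation with a memory fraction below 1/parallelism" is through JAX's `XLA_PYTHON_CLIENT_PREALLOCATE` and `XLA_PYTHON_CLIENT_MEM_FRACTION` environment variables. The sketch below is an assumption about how such a setting could be computed; the helper `xla_memory_env`, the worker count, and the `safety_margin` value are hypothetical and not taken from the commit.

```python
import os


def xla_memory_env(num_parallel_workers, safety_margin=0.9):
    """Hypothetical helper: give each worker less than 1/num_workers of GPU memory."""
    if num_parallel_workers < 1:
        raise ValueError("num_parallel_workers must be at least 1")
    # Scale below 1/num_workers so concurrent processes do not exhaust memory.
    fraction = safety_margin / num_parallel_workers
    return {
        "XLA_PYTHON_CLIENT_PREALLOCATE": "true",
        "XLA_PYTHON_CLIENT_MEM_FRACTION": f"{fraction:.3f}",
    }


# These variables must be set before JAX initializes its GPU backend.
os.environ.update(xla_memory_env(num_parallel_workers=4))
```

Setting the fraction per process keeps preallocation enabled while leaving room for the other test workers sharing the same GPU.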

f8eda599

25 Nov, 2020 3 commits

[XLNet] Fix mems behavior (#8567) · 2a6fbe6a

Patrick von Platen authored Nov 25, 2020

* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible

2a6fbe6a

Return correct Bart hidden state tensors (#8747) · 369f1d77

Joe Davison authored Nov 25, 2020



* bart output hidden states upstream

* same w/ decoder

* add tests

* fix prophetnet

* fix gpt2 and ctrl

* fix fsmt and skip test for reformer and longformer

* fix all models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

369f1d77

Fix QA argument handler (#8765) · 138f45c1

Lysandre Debut authored Nov 25, 2020



* Fix QA argument handler

* Attempt to get a better fix for QA (#8768)
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

138f45c1