Commits · c108d0b5a43fee12e1ef578fe871f0f123b06018 · chenpangpang / transformers

07 Dec, 2020 2 commits

transformers-cli: LFS multipart uploads (> 5GB) (#8663) · 28fa014a

Julien Chaumond authored Dec 07, 2020



* initial commit

* [cli] lfs commands

* Fix FileSlice

* Tweak to FileSlice

* [hf_api] Backport filetype arg from `datasets`

cc @lhoestq

* Silm down the CI while i'm working

* Ok let's try this in CI

* Update config.yml

* Do not try this at home

* one more try

* Update lfs.py

* Revert "Tweak to FileSlice"

This reverts commit d7e32c4b3500400486411e85a2b74e57fb6b52f5.

* Update test_hf_api.py

* Update test_hf_api.py

* Update test_hf_api.py

* CI still green?

* make CI green again?

* Update test_hf_api.py

* make CI red again?

* Update test_hf_api.py

* add CI style back

* Fix CI?

* oh my

* doc + switch back to real staging endpoint

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* Fix docblock + f-strings
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

28fa014a

Add TFGPT2ForSequenceClassification based on DialogRPT (#8714) · 483e1327

sandip authored Dec 07, 2020

* Add TFGPT2ForSequenceClassification based on DialogRPT

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* code refactor for latest other TF PR

* code refactor

* code refactor

* Update modeling_tf_gpt2.py

483e1327

03 Dec, 2020 1 commit
- Patch model parallel test (#8920) · aa60b230
  Lysandre Debut authored Dec 03, 2020
```
* Patch model parallel test

* Remove line

* Remove `ci_*` from scheduled branches
```
  aa60b230
02 Dec, 2020 3 commits

[PyTorch] Refactor Resize Token Embeddings (#8880) · 443f67e8

Patrick von Platen authored Dec 02, 2020

* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add news tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py

443f67e8

Warning about too long input for fast tokenizers too (#8799) · a8c3f9aa

Nicolas Patry authored Dec 02, 2020

* Warning about too long input for fast tokenizers too

If truncation is not set in tokenizers, but the tokenization is too long
for the model (`model_max_length`), we used to trigger a warning that

The input would probably fail (which it most likely will).

This PR re-enables the warning for fast tokenizers too and uses common
code for the trigger to make sure it's consistent across.

* Checking for pair of inputs too.

* Making the function private and adding it's doc.

* Remove formatting ?? in odd place.

* Missed uppercase.

a8c3f9aa

Transfoxl seq classification (#8868) · f6b44e61
sandip authored Dec 02, 2020
```
* Transfoxl sequence classification

* Transfoxl sequence classification
```
f6b44e61

01 Dec, 2020 2 commits
- Better support for resuming training (#8878) · 7c10dd22
  Sylvain Gugger authored Dec 01, 2020
  
  7c10dd22
- Ctrl for sequence classification (#8812) · 4a9e502a
  elk-cloner authored Dec 01, 2020
```
* add CTRLForSequenceClassification

* pass local test

* merge with master

* fix modeling test for sequence classification

* fix deco

* fix assert
```
  4a9e502a
30 Nov, 2020 3 commits

NerPipeline (TokenClassification) now outputs offsets of words (#8781) · d8fc26e9

Nicolas Patry authored Nov 30, 2020

* NerPipeline (TokenClassification) now outputs offsets of words

- It happens that the offsets are missing, it forces the user to pattern
match the "word" from his input, which is not always feasible.
For instance if a sentence contains the same word twice, then there
is no way to know which is which.
- This PR proposes to fix that by outputting 2 new keys for this
pipelines outputs, "start" and "end", which correspond to the string
offsets of the word. That means that we should always have the
invariant:

```python
input[entity["start"]: entity["end"]] == entity["entity_group"]
                                    # or entity["entity"] if not grouped
```

* Fixing doc style

d8fc26e9

Attempt to fix Flax CI error(s) (#8829) · 51b07131

Funtowicz Morgan authored Nov 30, 2020



* Slightly increase tolerance between pytorch and flax output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* test_multiple_sentences doesn't require torch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Simplify parameterization on "jit" to use boolean rather than str
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use `require_torch` on `test_multiple_sentences` because we pull the weight from the hub.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove pytest.mark.parametrize which seems to fail in some circumstances
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix unused imports.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Give default parameters values for traced model.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Review comment: Change sentences to sequences
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

51b07131

Add T5 Encoder for Feature Extraction (#8717) · 40ecaf0c

Ahmed Elnaggar authored Nov 30, 2020



* Add T5 Encoder class for feature extraction

* fix T5 encoder add_start_docstrings indent

* update init with T5 encoder

* update init with TFT5ModelEncoder

* remove TFT5ModelEncoder

* change T5ModelEncoder order in init

* add T5ModelEncoder to transformers init

* clean T5ModelEncoder

* update init with TFT5ModelEncoder

* add TFModelEncoder for Tensorflow

* update init with TFT5ModelEncoder

* Update src/transformers/models/t5/modeling_t5.py

change output from Seq2SeqModelOutput to BaseModelOutput
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove encoder_outputs

1. remove encoder_outputs from the function call.
2. remove the encoder_outputs If statement.
3. remove isinstance from return_dict.

* Authorize missing decoder keys

* remove unnecessary input parameters

remove pask_key_values and use_cache

* remove use_cache

remove use_cache from the forward method

* add doctoring for T5 encoder

add doctoring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING

* change return_dict to dot access

* add T5_ENCODER_INPUTS_DOCSTRING for TF T5

* change TFT5Encoder output type to BaseModelOutput

* remove unnecessary parameters for TFT5Encoder

* remove unnecessary if statement

* add import BaseModelOutput

* fix BaseModelOutput typo to TFBaseModelOutput

* update T5 doc with T5ModelEncoder

* add T5ModelEncoder to tests

* finish pytorch

* finish docs and mt5

* add mtf to init

* fix init

* remove n_positions

* finish PR

* Update src/transformers/models/mt5/modeling_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_tf_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/mt5/modeling_tf_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

40ecaf0c

29 Nov, 2020 1 commit

[Pegasus] Refactor Tokenizer (#8731) · 5ced23dc

Patrick von Platen authored Nov 29, 2020

* refactor

* further refactor

* fix the rest tomorrow

* save intermediate

* finish slow tokenizer

* make more tests pass

* finish refactor

* fix comment

* clean further

* fix name

* fix naming

* Update src/transformers/models/reformer/tokenization_reformer.py

* Apply suggestions from code review

* Apply suggestions from code review

* refactor

* fix init tokenizers

* refactor

* improve convert

* refactor

* correct convert slow tokenizer

* final fix for Pegasus Tok

* remove ipdb

* improve links

5ced23dc

27 Nov, 2020 6 commits

Model parallel tests should return, not pass in non model parallel settings. (#8825) · 18c32eeb
Lysandre Debut authored Nov 27, 2020

18c32eeb

BART & FSMT: fix decoder not returning hidden states from the last layer (#8597) · 0a921b64

Max Del authored Nov 27, 2020



* Fix decoder not returning hidden states from the last layer

* Resolve conflict

* Change the way to gather hidden states

* Add decoder hidden states test

* Make pytest and black happy

* Remove redundant line

* remove new line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

0a921b64

Add barthez model (#8393) · 81fe0bf0

Moussa Kamal Eddine authored Nov 27, 2020



* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained french seq2seq model that uses BART objective.

* Apply suggestions from code review docs typos
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

81fe0bf0

Fix dpr<>bart config for RAG (#8808) · a7d46a06

Patrick von Platen authored Nov 27, 2020

* correct dpr test and bert pos fault

* fix dpr bert config problem

* fix layoutlm

* add config to dpr as well

a7d46a06

[Flax test] Add require pytorch to flix flax test (#8816) · a2cf3759
Patrick von Platen authored Nov 27, 2020
```
* try flax fix

* same for roberta
```
a2cf3759

[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791) · f8eda599

Kristian Holsheimer authored Nov 27, 2020

* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes

* [FlaxRoberta] Fix non-broadcastable attention mask

* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)

* Partially revert "Use jax.numpy ..."

* Add tests for batched forward passes

* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA

* Auto-fix style

* Re-enable GPU memory preallocation but with mem fraction < 1/paralleism

f8eda599

25 Nov, 2020 3 commits

[XLNet] Fix mems behavior (#8567) · 2a6fbe6a

Patrick von Platen authored Nov 25, 2020

* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible

2a6fbe6a

Return correct Bart hidden state tensors (#8747) · 369f1d77

Joe Davison authored Nov 25, 2020



* bart output hidden states upstream

* same w/ decoder

* add tests

* fix prophetnet

* fix gpt2 and ctrl

* fix fstm and skip test for reformer and longformer

* fix all models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

369f1d77

Fix QA argument handler (#8765) · 138f45c1

Lysandre Debut authored Nov 25, 2020



* Fix QA argument handler

* Attempt to get a better fix for QA (#8768)
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

138f45c1

24 Nov, 2020 5 commits

New TF model inputs (#8602) · 29d49924

Julien Plu authored Nov 24, 2020

* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add input processing for TF Flaubert

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input precessing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add the new inputs in new Longformer models

* Update the template with the new input processing

* Remove useless assert

* Apply style

* Trigger CI

29d49924

[core] implement support for run-time dependency version checking (#8645) · 82d443a7

Stas Bekman authored Nov 24, 2020



* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

82d443a7

MT5 should have an autotokenizer (#8743) · e09e54fd

Lysandre Debut authored Nov 24, 2020

* MT5 should have an autotokenizer

* Different configurations should be able to point to same tokenizers

e09e54fd

Fix slow tests v2 (#8746) · 6fdd0bb2

Lysandre Debut authored Nov 24, 2020

* Fix BART test

* Fix MBART tests

* Remove erroneous line from yaml

* Update tests/test_modeling_bart.py

* Quality

6fdd0bb2

Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3

zhiheng-huang authored Nov 24, 2020



* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2c83b3c3

23 Nov, 2020 7 commits

TF BERT test update · 7f2c0091
LysandreJik authored Nov 23, 2020

7f2c0091
Update TF BERT test · e1b7e10d
LysandreJik authored Nov 23, 2020

e1b7e10d

Add early stopping callback to pytorch trainer (#8581) · 8ffc01a7

Colin Brochtrup authored Nov 23, 2020

* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make funciton name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make docstring EarlyStoppingCallabck match other callbacks.

* Fix typo in docs

8ffc01a7

consistent ignore keys + make private (#8737) · e84786aa

Stas Bekman authored Nov 23, 2020

* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment

e84786aa

gpt2 and t5 parallel modeling (#8696) · 1cd9be2a

alexorona authored Nov 23, 2020



* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

1cd9be2a

Improve bert-japanese tokenizer handling (#8659) · 0cc5ab13

Julien Chaumond authored Nov 23, 2020



* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* Sentencepiece not installed make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

0cc5ab13

Fix bug in x-attentions output for roberta and harden test to catch it (#8660) · 18c8cf00
Yossi Synett authored Nov 23, 2020

18c8cf00

20 Nov, 2020 1 commit
- fix flaky ci (#8694) · 9c0afdaf
  Patrick von Platen authored Nov 20, 2020
  
  9c0afdaf
19 Nov, 2020 6 commits

Add sentencepiece to the CI and fix tests (#8672) · 6494910f
Sylvain Gugger authored Nov 19, 2020
```
* Fix the CI and tests

* Fix quality

* Remove that m form nowhere
```
6494910f
[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283) · 42111f1d
Stas Bekman authored Nov 19, 2020
```
* don't reconvert when the type is already right

* better name

* adjust logic as suggested

* merge
```
42111f1d

`disable_ngram_loss` fix for prophetnet (#8554) · ca0109bd

Zhylko Dima authored Nov 19, 2020



* `disable_ngram_loss` fix for prophetnet

* add changes documentation

* fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments

* mean label smoothing loss

* small refactor

* fix test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

ca0109bd

Better filtering of the model outputs in Trainer (#8633) · 4208f496
Sylvain Gugger authored Nov 19, 2020
```
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
```
4208f496

Fix a bunch of slow tests (#8634) · f2e07e72

Lysandre Debut authored Nov 19, 2020



* CI should install `sentencepiece`

* Requiring TF

* Fixing some TFDPR bugs

* remove return_dict=False/True hack
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

f2e07e72

Tf longformer for sequence classification (#8231) · 5362bb8a

elk-cloner authored Nov 19, 2020



* working on LongformerForSequenceClassification

* add TFLongformerForMultipleChoice

* add TFLongformerForTokenClassification

* use add_start_docstrings_to_model_forward

* test TFLongformerForSequenceClassification

* test TFLongformerForMultipleChoice

* test TFLongformerForTokenClassification

* remove test from repo

* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice

* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes

* pass all tests except test_inputs_embeds

* sync with master

* pass all tests except test_inputs_embeds

* pass all tests

* pass all tests

* work on test_inputs_embeds

* fix style and quality

* make multi choice work

* fix TFLongformerForTokenClassification signature

* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature

* fix mult choice

* fix mc hint

* fix input embeds

* fix input embeds

* refactor input embeds

* fix copy issue

* apply sylvains changes and clean more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

5362bb8a