- 25 Nov, 2020 4 commits
-
Patrick von Platen authored
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
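The `use_cache` refactor above is about reusing past states during autoregressive generation instead of re-encoding the whole sequence at every step. A minimal toy sketch of that idea (illustrative names only, not the transformers API; the "attention" here is a dummy sum):

```python
# Toy sketch of state caching during autoregressive decoding.
# All names are illustrative; this is not the transformers API.

def attend(queries, keys):
    # Dummy "attention": each position just sums the keys it can see.
    return [sum(keys[: i + 1]) for i in range(len(queries))]

def generate_no_cache(prompt, steps):
    seq = list(prompt)
    for _ in range(steps):
        # Without a cache, every step re-processes the whole sequence.
        out = attend(seq, seq)
        seq.append(out[-1] % 10)
    return seq

def generate_with_cache(prompt, steps):
    seq = list(prompt)
    cache = list(prompt)  # previously computed states, reused each step
    for _ in range(steps):
        # With caching, only the newest token's contribution is added;
        # past states come from the cache instead of being recomputed.
        new = sum(cache) % 10
        seq.append(new)
        cache.append(new)
    return seq
```

Both paths produce identical tokens; the cached version simply avoids the quadratic recomputation, which is why a wrong `use_cache`/`use_mems` flag silently changes speed (or, if mishandled, outputs).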
-
Joe Davison authored
* bart output hidden states upstream
* same w/ decoder
* add tests
* fix prophetnet
* fix gpt2 and ctrl
* fix FSMT and skip test for reformer and longformer
* fix all models

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
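The pattern this PR wires through the models is: when `output_hidden_states=True`, the forward pass collects every intermediate activation into a tuple alongside the final output. A minimal pure-Python sketch of that pattern (the layer stack and shapes are illustrative, not the transformers internals):

```python
# Sketch of the output_hidden_states pattern: a stack of layers that
# optionally returns every intermediate activation, not just the last.

def forward(x, layers, output_hidden_states=False):
    all_hidden = (x,) if output_hidden_states else None
    for layer in layers:
        x = layer(x)
        if output_hidden_states:
            # accumulate each layer's output, embedding output first
            all_hidden = all_hidden + (x,)
    return (x, all_hidden) if output_hidden_states else (x, None)

# toy "layers" standing in for transformer blocks
layers = [lambda v: v + 1, lambda v: v * 2]
last, states = forward(3, layers, output_hidden_states=True)
```

Here `states` holds the input plus one entry per layer, which is the shape of contract the commits above make consistent across encoder and decoder.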
-
Lysandre Debut authored
* Fix QA argument handler
* Attempt to get a better fix for QA (#8768)

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
-
Sylvain Gugger authored
* First draft
* Styling
* With all changes staged
* Update docs/source/index.rst
* Styling

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
- 24 Nov, 2020 10 commits
-
Manuel Romero authored
-
Julien Plu authored
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add input processing for TF Flaubert
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs) > 0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add the new inputs in new Longformer models
* Update the template with the new input processing
* Remove useless assert
* Apply style
* Trigger CI
-
Stas Bekman authored
* implement support for run-time dependency version checking
* try not escaping !
* use findall that works on py36
* small tweaks
* autoformatter worship
* simplify
* shorter names
* add support for non-versioned checks
* add deps
* revert
* tokenizers not required, check version only if installed
* make a proper distutils cmd and add make target
* tqdm must be checked before tokenizers
* workaround the DistributionNotFound peculiar setup
* handle the rest of packages in setup.py
* fully sync setup.py's install_requires - to check them all
* nit
* make install_requires more readable
* typo
* Update setup.py
* restyle
* add types
* simplify
* simplify2

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
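A run-time dependency version check of the kind this PR describes can be sketched with only the standard library; the helper below is illustrative (the actual transformers implementation differs in details such as operator support and error messages):

```python
# Sketch of run-time dependency version checking, stdlib only.
# `require_version` and its "pkg>=x.y.z" syntax are illustrative here.
from importlib.metadata import version, PackageNotFoundError

def parse(v):
    # crude numeric-tuple parse, enough for plain "x.y.z" versions;
    # suffixes like "rc1" are simply ignored in this sketch
    return tuple(int(p) for p in v.split(".")[:3] if p.isdigit())

def require_version(requirement):
    # supports "pkg>=x.y.z" and bare "pkg" (existence-only check)
    if ">=" in requirement:
        pkg, wanted = requirement.split(">=")
    else:
        pkg, wanted = requirement, None
    try:
        got = version(pkg)
    except PackageNotFoundError:
        raise ImportError(f"{pkg} is required but not installed")
    if wanted is not None and parse(got) < parse(wanted):
        raise ImportError(f"{pkg}>={wanted} is required, found {got}")
```

The "tokenizers not required, check version only if installed" bullet corresponds to catching the not-installed case separately from the too-old case, as the two branches above do.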
-
Quentin Lhoest authored
-
Binoy Dalal authored
* added instructions for syncing upstream master with forked master via PR
* expand to add a note on why this is requested

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Lysandre Debut authored
* MT5 should have an autotokenizer
* Different configurations should be able to point to same tokenizers
-
Lysandre Debut authored
* Fix BART test
* Fix MBART tests
* Remove erroneous line from yaml
* Update tests/test_modeling_bart.py
* Quality
-
zhiheng-huang authored
* Support BERT relative position embeddings
* Fix typo in README.md
* Address review comment
* Fix failing tests
* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
* make fix copies
* fix configs of electra and albert and fix longformer
* remove copy statement from longformer
* fix albert
* fix electra
* Add bert variants forward tests for various position embeddings
* [tiny] Fix style for test_modeling_bert.py
* improve docstring
* [tiny] improve docstring and remove unnecessary dependency
* [tiny] Remove unused import
* re-add to ALBERT
* make embeddings work for ALBERT
* add test for albert

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
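The core idea behind relative position embeddings is that attention looks up an embedding for the (clipped) distance between query and key positions rather than one absolute embedding per position. A small sketch of the index matrix such schemes build (the helper and its clipping convention are illustrative, not the exact BERT implementation):

```python
# Sketch of the relative-position index matrix: entry [q][k] is the
# distance k - q, clipped to +/- max_distance, then shifted to be a
# non-negative index into a (2 * max_distance + 1)-row embedding table.

def relative_position_matrix(seq_len, max_distance):
    matrix = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            dist = max(-max_distance, min(max_distance, k - q))
            row.append(dist + max_distance)  # shift into [0, 2*max_distance]
        matrix.append(row)
    return matrix
```

Because the index depends only on the offset `k - q`, tokens far beyond `max_distance` share an embedding, which is what lets such models generalize past the training sequence length.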
-
Julien Chaumond authored
-
Lysandre Debut authored
* Add parallelize methods to the .rst files
* Correct format
-
- 23 Nov, 2020 17 commits
-
LysandreJik authored
-
LysandreJik authored
-
Colin Brochtrup authored
* Add early stopping patience and a minimum threshold the metric must improve by to the PyTorch trainer
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback; add callback event for updating the best metric for the early stopping callback to trigger on
* Run make style
* make function name sensible
* Improve new argument docstring wording and hope that flaky CI test passes
* Use on_evaluation callback instead of custom; remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting; make style went rogue on me
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them; move comment out of public docs
* Make separate test for early stopping callback; add test of invalid arguments
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make EarlyStoppingCallback docstring match other callbacks
* Fix typo in docs
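The patience logic described in these bullets can be sketched independently of the Trainer API: stop when the tracked metric fails to improve by at least `threshold` for `patience` consecutive evaluations. The class below is an illustrative stand-in, not `EarlyStoppingCallback` itself:

```python
# Minimal sketch of early-stopping patience with a minimum-improvement
# threshold. Names and defaults are illustrative.

class EarlyStopper:
    def __init__(self, patience=3, threshold=0.0, greater_is_better=True):
        self.patience = patience
        self.threshold = threshold
        self.greater_is_better = greater_is_better
        self.best = None
        self.counter = 0

    def step(self, metric):
        # returns True when training should stop
        if self.best is None:
            self.best = metric  # best metric not defined yet
            return False
        improved = (metric > self.best + self.threshold
                    if self.greater_is_better
                    else metric < self.best - self.threshold)
        if improved:
            self.best = metric
            self.counter = 0  # reset patience on improvement
        else:
            self.counter += 1
        return self.counter >= self.patience
```

The "Set patience counter to 0 if best metric not defined yet" bullet corresponds to the first-evaluation branch: the initial metric seeds `best` without counting against patience.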
-
Sylvain Gugger authored
-
Stas Bekman authored
* consistent ignore keys + make private
* style
* authorized_missing_keys => _keys_to_ignore_on_load_missing
* authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
* move public doc of private attributes to private comment
-
Sylvain Gugger authored
-
alexorona authored
* gpt2 and t5 parallel modeling
* model_parallel utils update
* adding missing model_parallel_utils: adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5
* training_args reformat: reformatted training_args
* style formatting: doc string length on training_args and model_parallel_utils
* style changes: make style && make quality for training_args and model_parallel_utils
* adding tests
* minor change in trainer: reverts loss calculation
* Update training_args.py
* Update training_args.py: added back docstring language for adam_beta1 and adam_beta2
* Update trainer.py
* Update src/transformers/trainer.py
* Fix style & rebase

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
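The heart of this style of model parallelism is a device map: a table assigning each transformer block to a device so a model too large for one GPU can be split across several. A sketch of building such a map (the helper name, the even-split policy, and the device strings are illustrative, not the exact model_parallel_utils code):

```python
# Sketch of a layer-to-device map for parallelize-style model parallelism:
# split a stack of blocks into contiguous chunks, one chunk per device,
# with earlier devices absorbing any remainder.

def make_device_map(num_layers, devices):
    per_device, extra = divmod(num_layers, len(devices))
    device_map, start = {}, 0
    for i, device in enumerate(devices):
        count = per_device + (1 if i < extra else 0)
        device_map[device] = list(range(start, start + count))
        start += count
    return device_map
```

At forward time the model then moves activations between devices at each chunk boundary, which is why contiguous assignment (rather than round-robin) keeps the number of cross-device copies small.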
-
Stas Bekman authored
* make generate work with multigpu
* better fix - thanks @sgugger
-
Sylvain Gugger authored
* Change default cache path
* Document changes
* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Julien Chaumond authored
* Make ci fail
* Try to make tests actually run?
* CI finally failing?
* Fix CI
* Revert "Fix CI" (reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb)
* Oops, wrong one
* one more try
* Ok ok, let's move this elsewhere
* Alternative to globals() (#8667)
* Error is raised later so return None
* Sentencepiece not installed makes some tokenizers None
* Apply Lysandre wisdom
* Slightly clearer comment? cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Amine Abdaoui authored
* [model_cards]: control arabic model examples
* [model_cards]: control input examples of Geotrend models
* [model_cards]: add link to generation script
-
Jessica Yung authored
* Add pip install upgrade to resolve import error

  Add `pip install --upgrade tensorflow-gpu` to remove the error below:

  ```
  ---------------------------------------------------------------------------
  AttributeError                            Traceback (most recent call last)
  <ipython-input-2-094fadb93f3f> in <module>()
        1 import torch
  ----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer
        3
        4 torch.set_grad_enabled(False)

  4 frames
  /usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>()
      133
      134 # Pipelines
  --> 135 from .pipelines import (
      136     Conversation,
      137     ConversationalPipeline,

  /usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>()
       46     import tensorflow as tf
       47
  ---> 48 from .modeling_tf_auto import (
       49     TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING,
       50     TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,

  /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>()
       49 from .configuration_utils import PretrainedConfig
       50 from .file_utils import add_start_docstrings
  ---> 51 from .modeling_tf_albert import (
       52     TFAlbertForMaskedLM,
       53     TFAlbertForMultipleChoice,

  /usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>()
       22 import tensorflow as tf
       23
  ---> 24 from .activations_tf import get_tf_activation
       25 from .configuration_albert import AlbertConfig
       26 from .file_utils import (

  /usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>()
       52     "gelu": tf.keras.layers.Activation(gelu),
       53     "relu": tf.keras.activations.relu,
  ---> 54     "swish": tf.keras.activations.swish,
       55     "silu": tf.keras.activations.swish,
       56     "gelu_new": tf.keras.layers.Activation(gelu_new),

  AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish'
  ```

  I have tried running the colab after this change and it seems to work fine (all the cells run with no errors).

* Update notebooks/02-transformers.ipynb: only need to upgrade tensorflow, not tensorflow-gpu.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Yossi Synett authored
-
Tony authored
-
Nguyen Van Nha authored
* create README.md
* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Sagor Sarker authored
-
moniquebm authored
* Create README.md
* correct metrics id cc @lhoestq

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
- 22 Nov, 2020 1 commit
-
Santiago Castro authored
-
- 20 Nov, 2020 6 commits
-
Patrick von Platen authored
-
Binoy Dalal authored
* refactored existing nested loops to vectorized implementation
* replaced explicit indexing with torch.where
* modifying score for previous input_ids only
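The refactor above replaces explicit nested loops over the batch with a single masked, elementwise select of the shape `torch.where(mask, penalized, scores)`. A pure-Python stand-in showing the two styles on the same task, penalizing scores of tokens that already appear in `input_ids` (function names and the penalty rule are illustrative):

```python
# Sketch of the loop -> "where"-style refactor for score modification.
# Pure-Python stand-in for the torch.where version; names are illustrative.

def penalize_loops(scores, input_ids, penalty):
    # original style: explicit iteration over batch rows and seen tokens
    out = [row[:] for row in scores]
    for b, ids in enumerate(input_ids):
        for tok in set(ids):  # set() so repeated tokens are penalized once
            s = out[b][tok]
            out[b][tok] = s / penalty if s > 0 else s * penalty
    return out

def penalize_where(scores, input_ids, penalty):
    # vectorized style: one masked elementwise select per row, the form
    # a tensor version would express as torch.where(mask, penalized, scores)
    seen = [set(ids) for ids in input_ids]
    return [
        [
            (s / penalty if s > 0 else s * penalty) if tok in seen[b] else s
            for tok, s in enumerate(row)
        ]
        for b, row in enumerate(scores)
    ]
```

Both produce identical results; on real tensors the `where` form runs as one fused kernel instead of Python-level loops, which is the point of the refactor.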
-
Roman Kalyakin authored
-
Quentin Lhoest authored
* replace init_ddp_connection for index init
* style
* add finetune test
* add test data
* move generate tensors to device
* add test on EM metric
* style
* allow multi process test
* keep gloo process group for retrieval
* add multi-gpu test
* use custom accelerator
* clean test finetune
* minor
* style
* style
* typo
* use python call instead of imported main function
* return_dict fix in modeling_rag
* use float32 in retrieval
* store as float32 as well in the custom knowledge dataset example
* style
* rename to finetune_rag
* style
* update readme
* rename utils and callbacks to utils_rag and callbacks_rag
* fix test
* patrick's comments
* generate dummy data in the finetune test script
* remove dummy data files
* style
-
Sylvain Gugger authored
-
Kevin Canwen Xu authored
* Update the bibtex with EMNLP demo
* Update README.md
* Update README.md
-
- 19 Nov, 2020 2 commits
-
Sylvain Gugger authored
* Fix the CI and tests
* Fix quality
* Remove that m from nowhere
-
Stas Bekman authored
* fix deprecation warning
* fix
-