Commits · 8ffc01a76ad4c446b16322c3b893a8a3f39c14c0 · chenpangpang / transformers

23 Nov, 2020 15 commits

Add early stopping callback to pytorch trainer (#8581) · 8ffc01a7

Colin Brochtrup authored Nov 23, 2020

* Add early stopping patience, and a minimum threshold by which the metric must improve to prevent early stopping, to the PyTorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make function name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make the EarlyStoppingCallback docstring match other callbacks.

* Fix typo in docs

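The patience and threshold behaviour described in the bullets above can be sketched in plain Python. This is a minimal illustration, not the code from this commit: the class `EarlyStopper`, its fields, and the `update` method are hypothetical names, and the exact comparison used by the real callback is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlyStopper:
    """Hypothetical helper; not the EarlyStoppingCallback added by this commit."""

    patience: int = 1              # evaluations without improvement tolerated
    threshold: float = 0.0         # minimum gain that counts as an improvement
    greater_is_better: bool = True
    best: Optional[float] = None
    bad_evals: int = 0

    def update(self, metric: float) -> bool:
        """Record one evaluation result; return True when training should stop."""
        if self.best is None:
            # Best metric not defined yet: record it and keep the counter at 0.
            self.best = metric
            self.bad_evals = 0
            return False
        gain = metric - self.best if self.greater_is_better else self.best - metric
        if gain > self.threshold:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = EarlyStopper(patience=2, threshold=0.01)
decisions = [stopper.update(m) for m in [0.50, 0.60, 0.605, 0.60]]
# decisions == [False, False, False, True]
```

The third evaluation improves by less than the threshold and the fourth does not improve at all, so the counter reaches the patience of 2 and the sketch signals a stop.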

Fix max length in run_plm script (#8738) · 367f497d
Sylvain Gugger authored Nov 23, 2020

consistent ignore keys + make private (#8737) · e84786aa

Stas Bekman authored Nov 23, 2020

* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment

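As a rough illustration of what the renamed attributes are for, the sketch below filters key names against ignore patterns. The commit message only records the rename; that the attributes hold regular-expression patterns matched with `re.search` is an assumption here, and `drop_ignored_keys` and `ToyModel` are hypothetical names.

```python
import re
from typing import Iterable, List


def drop_ignored_keys(keys: Iterable[str], ignore_patterns: Iterable[str]) -> List[str]:
    """Keep only the keys that match none of the ignore patterns."""
    patterns = [re.compile(p) for p in ignore_patterns]
    return [k for k in keys if not any(p.search(k) for p in patterns)]


class ToyModel:
    # Attribute names taken from the commit message; the values are made up.
    _keys_to_ignore_on_load_missing = [r"position_ids"]
    _keys_to_ignore_on_load_unexpected = [r"pooler"]


missing = ["embeddings.position_ids", "classifier.weight"]
unexpected = ["pooler.dense.weight", "extra.bias"]

reported_missing = drop_ignored_keys(missing, ToyModel._keys_to_ignore_on_load_missing)
reported_unexpected = drop_ignored_keys(unexpected, ToyModel._keys_to_ignore_on_load_unexpected)
# reported_missing == ["classifier.weight"]
# reported_unexpected == ["extra.bias"]
```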

Document new training argument · 49759c0c
Sylvain Gugger authored Nov 23, 2020

gpt2 and t5 parallel modeling (#8696) · 1cd9be2a

alexorona authored Nov 23, 2020



* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

[trainer] make generate work with multigpu (#8716) · 1e45bef0
Stas Bekman authored Nov 23, 2020
```
* make generate work with multigpu

* better fix - thanks @sgugger
```

Change default cache path (#8734) · 90002427

Sylvain Gugger authored Nov 23, 2020



* Change default cache path

* Document changes

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Improve bert-japanese tokenizer handling (#8659) · 0cc5ab13

Julien Chaumond authored Nov 23, 2020



* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* When sentencepiece is not installed, make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

[model_cards]: control input examples of Geotrend models (#8727) · eec76615

Amine Abdaoui authored Nov 23, 2020

* [model_cards]: control arabic model examples

* [model_cards]: control input examples of Geotrend models

* [model_cards]: add link to generation script

Add pip install update to resolve import error in transformers notebook (#8616) · 143b564e

Jessica Yung authored Nov 23, 2020



* Add pip install update to resolve import error

Add a pip install upgrade of tensorflow-gpu to remove the error below:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-094fadb93f3f> in <module>()
      1 import torch
----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer
      3 
      4 torch.set_grad_enabled(False)

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>()
    133 
    134 # Pipelines
--> 135 from .pipelines import (
    136     Conversation,
    137     ConversationalPipeline,

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>()
     46     import tensorflow as tf
     47 
---> 48     from .modeling_tf_auto import (
     49         TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING,
     50         TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>()
     49 from .configuration_utils import PretrainedConfig
     50 from .file_utils import add_start_docstrings
---> 51 from .modeling_tf_albert import (
     52     TFAlbertForMaskedLM,
     53     TFAlbertForMultipleChoice,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>()
     22 import tensorflow as tf
     23 
---> 24 from .activations_tf import get_tf_activation
     25 from .configuration_albert import AlbertConfig
     26 from .file_utils import (

/usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>()
     52     "gelu": tf.keras.layers.Activation(gelu),
     53     "relu": tf.keras.activations.relu,
---> 54     "swish": tf.keras.activations.swish,
     55     "silu": tf.keras.activations.swish,
     56     "gelu_new": tf.keras.layers.Activation(gelu_new),

AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish'
```
I have tried running the colab after this change and it seems to work fine (all the cells run with no errors).

* Update notebooks/02-transformers.ipynb

only need to upgrade tensorflow, not tensorflow-gpu.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Fix bug in x-attentions output for roberta and harden test to catch it (#8660) · 18c8cf00
Yossi Synett authored Nov 23, 2020

[model_cards] Add card for gpt2-rnm (#8673) · 48cc2247
Tony authored Nov 23, 2020

create README.md (#8682) · 52585e40

Nguyen Van Nha authored Nov 23, 2020



* create README.md

* Apply suggestions from code review
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

added bangla-bert-sentiment model card (#8687) · b5187e31
Sagor Sarker authored Nov 23, 2020

Create README.md (#8630) · b6d864e2

moniquebm authored Nov 23, 2020



* Create README.md

* correct metrics id

cc @lhoestq
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

22 Nov, 2020 1 commit

Fix many typos (#8708) · e1f3156b
Santiago Castro authored Nov 22, 2020

20 Nov, 2020 6 commits

fix flaky ci (#8694) · 9c0afdaf
Patrick von Platen authored Nov 20, 2020

Vectorize RepetitionPenaltyLogitsProcessor to improve performance (#8598) · 29bdb883

Binoy Dalal authored Nov 20, 2020

* refactored existing nested loops to a vectorized implementation

* replaced explicit indexing with torch.where

* modifying score for previous input_ids only

moved temperature wrapper before topP/topK (#8686) · 2594bd8b
Roman Kalyakin authored Nov 20, 2020

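To see why the position of the temperature step can matter for top-p filtering, here is a small pure-Python sketch. It is not the library's implementation: `softmax`, `apply_temperature`, and `top_p_keep` are hypothetical helpers, and the nucleus rule is simplified to "keep the most probable tokens until their cumulative probability reaches `top_p`".

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_temperature(logits, temperature):
    """Divide every logit by the temperature (values above 1 flatten the distribution)."""
    return [x / temperature for x in logits]


def top_p_keep(logits, top_p):
    """Indices of the smallest set of tokens whose cumulative probability reaches top_p."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return sorted(kept)


logits = [2.0, 1.0, 0.0]
# Temperature applied before top-p: the flatter distribution needs all three tokens.
temperature_first = top_p_keep(apply_temperature(logits, 2.0), top_p=0.9)
# Top-p computed on the raw logits: two tokens already cover 0.9 of the mass.
top_p_first = top_p_keep(logits, top_p=0.9)
```

With these toy logits the two orderings keep different token sets, which is the kind of difference the reordering in this commit affects. Top-k is not shown because dividing by a positive temperature does not change the ranking of the logits.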

Fix rag finetuning + add finetuning test (#8585) · 8062fa63

Quentin Lhoest authored Nov 20, 2020

* replace init_ddp_connection for index init

* style

* add finetune test

* add test data

* move generate tensors to device

* add test on EM metric

* style

* allow multi process test

* keep gloo process group for retrieval

* add multi-gpu test

* use custom accelerator

* clean test finetune

* minor

* style

* style

* typo

* use python call instead of imported main function

* return_dict fix in modeling_rag

* use float32 in retrieval

* store as float32 as well in the custom knowledge dataset example

* style

* rename to finetune_rag

* style

* update readme

* rename utils and callbacks to utils_rag and callbacks_rag

* fix test

* patrick's comments

* generate dummy data in the finetune test script

* remove dummy data files

* style

Document adam betas TrainingArguments (#8688) · 63e91f5f
Sylvain Gugger authored Nov 20, 2020

Update the bibtex with EMNLP demo (#8678) · 94caaa93
Kevin Canwen Xu authored Nov 20, 2020
```
* Update the bibtex with EMNLP demo

* Update README.md

* Update README.md
```

19 Nov, 2020 18 commits

Add sentencepiece to the CI and fix tests (#8672) · 6494910f
Sylvain Gugger authored Nov 19, 2020
```
* Fix the CI and tests

* Fix quality

* Remove that m from nowhere
```
[examples/seq2seq] fix PL deprecation warning (#8577) · 0ad45e10
Stas Bekman authored Nov 19, 2020
```
* fix deprecation warning

* fix
```

Update bert-base-multilingual-cased-README.md (#8668) · 0e19a4c2

Arindum Roy authored Nov 19, 2020

The heading was originally uncased, which did not reflect the contents of this README. Changed it to cased.

revert · 06518404
Stas Bekman authored Nov 19, 2020

Please fix your software not to ping master · 297a2938

Stas Bekman authored Nov 19, 2020

You may be unaware but you're running some software that meddles with every commit on https://github.com/huggingface/transformers/

Something is wrong with the software you're using. It adds a reference to almost every PR in the master tree. Which is very wrong. Please check your software and please don't do it again.

Example:
see the bottom of this PR and most other PRs:
https://github.com/huggingface/transformers/pull/8639

[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283) · 42111f1d
Stas Bekman authored Nov 19, 2020
```
* don't reconvert when the type is already right

* better name

* adjust logic as suggested

* merge
```
Fix run_ner script (#8664) · 20b65860
Sylvain Gugger authored Nov 19, 2020
```
* Fix run_ner script

* Pin datasets
```

`disable_ngram_loss` fix for prophetnet (#8554) · ca0109bd

Zhylko Dima authored Nov 19, 2020



* `disable_ngram_loss` fix for prophetnet

* add changes documentation

* fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments

* mean label smoothing loss

* small refactor

* fix test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

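The "mean reduction and -100 to masked tokens" bullet above refers to a common convention: positions labelled -100 are left out of the loss, and the remaining per-token losses are averaged. The sketch below shows that convention in plain Python. It is not the ProphetNet `_compute_loss` code; `mean_nll` is a hypothetical helper, and returning 0.0 when every position is masked is a choice made only for this sketch.

```python
import math

IGNORE_INDEX = -100  # label value marking positions excluded from the loss


def mean_nll(log_probs, labels, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood over positions whose label is not ignore_index.

    log_probs: one list of per-class log-probabilities per position.
    labels: one target class index per position, or ignore_index to mask it.
    """
    losses = [-row[label] for row, label in zip(log_probs, labels) if label != ignore_index]
    # Sketch-only choice: an all-masked batch yields 0.0 instead of dividing by zero.
    return sum(losses) / len(losses) if losses else 0.0


log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
    [math.log(0.9), math.log(0.1)],
]
labels = [0, 1, IGNORE_INDEX]  # the third position is masked

loss = mean_nll(log_probs, labels)
# Only the first two positions contribute, so the sum is divided by 2, not 3.
```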

Merge remote-tracking branch 'origin/master' · 0603564e
Sylvain Gugger authored Nov 19, 2020

Forgot to save... · 1e08af38
Sylvain Gugger authored Nov 19, 2020

Release: v4.0.0-rc-1 · d86b5ffc
LysandreJik authored Nov 19, 2020

Fix a few last paths for the new repo org (#8666) · cb3e5c33
Sylvain Gugger authored Nov 19, 2020

fix small typo (#8644) · a79a96dd

Matthias authored Nov 19, 2020

Fixed a small typo in the XLNet and permutation language modelling section

Better filtering of the model outputs in Trainer (#8633) · 4208f496
Sylvain Gugger authored Nov 19, 2020
```
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
```

Fix a bunch of slow tests (#8634) · f2e07e72

Lysandre Debut authored Nov 19, 2020



* CI should install `sentencepiece`

* Requiring TF

* Fixing some TFDPR bugs

* remove return_dict=False/True hack
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

Tf longformer for sequence classification (#8231) · 5362bb8a

elk-cloner authored Nov 19, 2020



* working on LongformerForSequenceClassification

* add TFLongformerForMultipleChoice

* add TFLongformerForTokenClassification

* use add_start_docstrings_to_model_forward

* test TFLongformerForSequenceClassification

* test TFLongformerForMultipleChoice

* test TFLongformerForTokenClassification

* remove test from repo

* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice

* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes

* pass all tests except test_inputs_embeds

* sync with master

* pass all tests except test_inputs_embeds

* pass all tests

* pass all tests

* work on test_inputs_embeds

* fix style and quality

* make multi choice work

* fix TFLongformerForTokenClassification signature

* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature

* fix mult choice

* fix mc hint

* fix input embeds

* fix input embeds

* refactor input embeds

* fix copy issue

* apply Sylvain's changes and clean up more
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

fix missing return dict (#8653) · 62cd9ce9
Quentin Lhoest authored Nov 19, 2020

[model card] : fix bert-base-15lang-cased (#8655) · 0c2677f5
Amine Abdaoui authored Nov 19, 2020
```
the table was badly formatted because of a single line break
```