- 26 Oct, 2020 10 commits
-
-
Sylvain Gugger authored
* Important files
* Styling them all
* Revert "Styling them all"

  This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Styling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
-
Sylvain Gugger authored
* Fixes in preparation for doc styling
* More fixes
* Better syntax
* Fixes
* Style
* More fixes
* More fixes
-
Lysandre Debut authored
-
Sam Shleifer authored
-
Stas Bekman authored
-
Lysandre Debut authored
-
noise-field authored
* Add MLflow integration class

  Add integration code for MLflow in integrations.py along with the code that checks that MLflow is installed.
* Add MLflowCallback import

  Add import of MLflowCallback in trainer.py
* Handle model argument

  Allow the callback to handle the model argument and store model config items as hyperparameters.
* Log parameters to MLflow in batches

  MLflow cannot log more than a hundred parameters at once. Code added to split the parameters into batches of 100 items and log the batches one by one (see the sketch below).
* Fix style
* Add docs on MLflow callback
* Fix issue with unfinished runs

  The "fluent" API used in the MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed of and a new one is created before training is finished, it will refuse to log the results when the next trainer is created.
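The 100-parameter batching described above fits in a few lines. A minimal sketch of the idea, assuming a flat dict of hyperparameters; the constant and helper names are illustrative, not the actual integration code:

```python
import mlflow

MAX_PARAMS_PER_BATCH = 100  # MLflow rejects calls that log more than 100 params at once

def log_params_in_batches(params):
    """Log a flat dict of hyperparameters to the active MLflow run, 100 at a time."""
    items = list(params.items())
    for i in range(0, len(items), MAX_PARAMS_PER_BATCH):
        mlflow.log_params(dict(items[i : i + MAX_PARAMS_PER_BATCH]))
```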
-
Lysandre Debut authored
-
luyug authored
* Add mixed precision evaluation * use original flag
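Conceptually, mixed-precision evaluation just runs the forward pass under PyTorch's autocast context. A minimal runnable sketch of the idea (the model and batch are stand-ins and a CUDA device is assumed; this is not the Trainer's actual code):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2).cuda().eval()
batch = torch.randn(4, 8, device="cuda")

# autocast runs the matmul in float16 where safe, without touching stored weights.
with torch.no_grad(), torch.cuda.amp.autocast():
    logits = model(batch)
print(logits.dtype)  # torch.float16 under autocast
```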
-
Thomas Wolf authored
* fixing #8001 * make T5 tokenizer serialization more robust - style
-
- 24 Oct, 2020 1 commit
-
-
Suraj Patil authored
-
- 23 Oct, 2020 3 commits
-
-
Anthony MOI authored
-
Patrick von Platen authored
* remove reformer pad_token_id * fix pegasus
-
Thomas Wolf authored
[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970)

* WIP refactoring pipeline tests - switching to fast tokenizers
* fix dialog pipeline and fill-mask
* refactoring pipeline tests backbone
* make large tests slow
* fix tests (tf Bart inactive for now)
* fix doc...
* clean up for merge
* fixing tests - remove bart from summarization until there is TF
* fix quality and RAG
* Add new translation pipeline tests - fix JAX tests
* only slow for dialog
* Fixing the missing TF-BART imports in modeling_tf_auto
* spin out pipeline tests in separate CI job
* adding pipeline test to CI YAML
* add slow pipeline tests
* speed up tf and pt join test to avoid redoing all the standalone pt and tf tests
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/pipelines.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/testing_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add require_torch and require_tf in is_pt_tf_cross_test

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 22 Oct, 2020 11 commits
-
-
Sam Shleifer authored
* Move NoLayerEmbedTokens * TFWrappedEmbeddings * Add comment
-
Sylvain Gugger authored
* Fix checkpoint loading in Trainer * Fix typo
-
Sylvain Gugger authored
* Only log total_flos at the end of training * Fix test
-
Julien Chaumond authored
* FillMaskPipeline: support passing top_k on __call__

  Also move from topk to top_k
* migrate to new param name in tests
* Review from @sgugger
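A quick usage sketch of the renamed parameter (the checkpoint is an illustrative choice):

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilroberta-base")

# top_k is now accepted at call time (the old parameter name was `topk`).
for prediction in fill_mask("Paris is the <mask> of France.", top_k=3):
    print(prediction["token_str"], prediction["score"])
```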
-
Sylvain Gugger authored
* Start simplification
* More progress
* Finished script
* Address comments and update tests instructions
* Wrong test
* Accept files as inputs and fix test
* Update src/transformers/trainer_utils.py
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Fix labels and add combined score
* Add special labels
* Update TPU command
* Revert to old label strategy
* Use model labels
* Fix for STS-B
* Styling
* Apply suggestions from code review
  Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Code styling
* Fix review comments

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
-
Nicolas Patry authored
* Actually make the "translation", "translation_XX_to_YY" task behave correctly.

  Background:
  - Currently "translation_cn_to_ar" does not work (only 3 pairs are supported).
  - Some models contain in their config the correct values for the (src, tgt) pair they can translate. It's usually just one pair, and we can infer it automatically from `model.config.task_specific_params`. If it's not defined we can still probably load the TranslationPipeline nevertheless.

  Proposed fix:
  - A simplified version of what could become a more general "parametrized" task; "translation" + (src, tgt) in this instance is what we need in the general case. The way we go about it for now is simply parsing "translation_XX_to_YY" (see the sketch below). If more cases of parametrized tasks arise we should preferably go with something closer to what `datasets` proposes, which is a secondary argument `task_options` carrying what the task requires.
  - Should be backward compatible in all cases; for instance `pipeline(task="translation_en_to_de")` should work out of the box.
  - Should provide a warning when a specific translation pair has been selected on behalf of the user using `model.config.task_specific_params`.
* Update src/transformers/pipelines.py
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
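The "parametrized task" reduces to parsing the language pair out of the task string. A minimal sketch of the idea, not the actual pipelines.py implementation:

```python
import re

def parse_translation_task(task):
    """Recover (src, tgt) from names like "translation_en_to_de".

    Returns (None, None) for the bare "translation" task, in which case the
    pair would be inferred from model.config.task_specific_params.
    """
    match = re.match(r"^translation(?:_([A-Za-z]+)_to_([A-Za-z]+))?$", task)
    if match is None:
        raise KeyError(f"Unknown task {task!r}")
    return match.groups()

print(parse_translation_task("translation_en_to_de"))  # ('en', 'de')
print(parse_translation_task("translation"))           # (None, None)
```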
-
Funtowicz Morgan authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
Patrick von Platen authored
* fix config save
* add test
* add config class variable and another test
* line break
* fix fsmt and typo
* god am I making many errors today :-/
* Update src/transformers/configuration_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
wlhgtc authored
* ADD: add whole word mask proxy for both eng and chinese
* MOD: adjust format
* MOD: reformat code
* MOD: update import
* MOD: fix bug
* MOD: add import
* MOD: fix bug
* MOD: decouple code and update readme
* MOD: reformat code
* Update examples/language-modeling/README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/language-modeling/README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/language-modeling/run_language_modeling.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/language-modeling/run_language_modeling.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/language-modeling/run_language_modeling.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/language-modeling/run_language_modeling.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change wwm to whole_word_mask
* reformat code
* reformat
* format
* Code quality
* ADD: update chinese ref readme
* MOD: small changes
* MOD: small changes2
* update readme

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Haebin Shin authored
-
rmroczkowski authored
-
- 21 Oct, 2020 5 commits
-
-
Evan Pete Walsh authored
* fix docstring for 'special_tokens_mask' * revert auto formatter changes * revert another auto format * revert another auto format
-
Patrick von Platen authored
* correct xlm prophetnet auto model and examples * fix line-break docs
-
François Lagunas authored
Improved TensorBoard and Wandb integration, as well as optuna and ray/tune support, with minor modifications to trainer core code.
-
Stas Bekman authored
* make the save_load special key tests common
* handle mbart
* cleaner solution
* fix
* move test_save_load_missing_keys back into fsmt for now
* restore
* style
* add marian
* add pegasus
* blenderbot
* revert - no static embed
-
Sam Shleifer authored
* half done
* doc improvement
* Cp test file
* broken
* broken test
* undo some mess
* ckpt
* borked
* Halfway
* 6 passing
* boom boom
* Much progress but still 6
* boom boom
* merged master
* 10 passing
* boom boom
* Style
* no t5 changes
* 13 passing
* Integration test failing, but not gibberish
* Frustrated
* Merged master
* 4 fail
* 4 fail
* fix return_dict
* boom boom
* Still only 4
* prepare method
* prepare method
* before delete classif
* Skip tests to avoid adding boilerplate
* boom boom
* fast tests passing
* style
* boom boom
* Switch to supporting many input types
* remove FIXMENORM
* working
* Fixed past_key_values/decoder_cached_states confusion
* new broken test
* Fix attention mask kwarg name
* undo accidental
* Style and reviewers
* style
* Docs and common tests
* Cleaner assert messages
* copy docs
* style issues
* Sphinx fix
* Simplify caching logic
* test does not require torch
* copy _NoLayerEmbedTokens
* Update src/transformers/modeling_tf_bart.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update tests/test_modeling_tf_bart.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_bart.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_bart.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_bart.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Line length and dont document None
* Add pipeline test coverage
* assert msg
* At parity
* Assert messages
* mark slow
* Update compile test
* back in init
* Merge master
* Fix tests

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 20 Oct, 2020 4 commits
-
-
Lysandre authored
-
Shai Erera authored
I'm using transformers 3.3.1 and ran a training script with `--save_total_limit 3`. I hit the exception below, and after debugging the code found that it wrongly tries to index into the `best_model_checkpoint` *str* rather than the `sorted_checkpoints` array. When running without the fix I got this exception:

```
Traceback (most recent call last):
  File "/<HOME>/.conda/envs/transformers/lib/python3.7/site-packages/transformers/trainer.py", line 921, in _save_training
    self._rotate_checkpoints(use_mtime=True)
  File "/<HOME>/.conda/envs/transformers/lib/python3.7/site-packages/transformers/trainer.py", line 1283, in _rotate_checkpoints
    checkpoints_sorted = self._sorted_checkpoints(use_mtime=use_mtime)
  File "/<HOME>/.conda/envs/transformers/lib/python3.7/site-packages/transformers/trainer.py", line 1274, in _sorted_checkpoints
    checkpoints_sorted[best_model_index],
TypeError: 'str' object does not support item assignment
```
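Reduced to a few lines, the bug class is indexing through the str where the list was intended. A hedged sketch of the intended behavior (variable names follow the traceback; the exact code in trainer.py may differ):

```python
checkpoints_sorted = ["ckpt-100", "ckpt-200", "ckpt-300"]
best_model_checkpoint = "ckpt-200"
best_model_index = checkpoints_sorted.index(best_model_checkpoint)

# Buggy shape: assigning into the *str* instead of the list, e.g.
# best_model_checkpoint[best_model_index] = ...
#   -> TypeError: 'str' object does not support item assignment

# Intended behavior: move the best checkpoint to the end of the *list* so
# rotation under --save_total_limit never deletes it.
checkpoints_sorted[best_model_index], checkpoints_sorted[-1] = (
    checkpoints_sorted[-1],
    checkpoints_sorted[best_model_index],
)
print(checkpoints_sorted)  # ['ckpt-100', 'ckpt-300', 'ckpt-200']
```
-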
Sylvain Gugger authored
-
Stas Bekman authored
* rename skip targets + docs
* fix quotes
* style
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* small improvements
* fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 19 Oct, 2020 6 commits
-
-
Patrick von Platen authored
* fix encoder decoder models * add .gitignore
-
Bram Vanroy authored
* Raise error when using AMP on non-CUDA device * make style * make style
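The guard amounts to refusing fp16 when the resolved device cannot run AMP. A minimal sketch of the check (the flag and device variables are stand-ins, not the library's actual names):

```python
import torch

use_fp16 = True
device = torch.device("cpu")  # whatever device resolution produced

# torch.cuda.amp only works on CUDA devices, so fail fast with a clear message
# instead of crashing later inside autocast/GradScaler.
if use_fp16 and device.type != "cuda":
    raise ValueError("Mixed precision training with AMP (`--fp16`) requires a CUDA device.")
```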
-
Ayub Subhaniya authored
Seeing an error when sending `decoder_config` as a parameter while initializing an encoder-decoder model from pretrained. Fixed "UnboundLocalError: local variable 'decoder_config' referenced before assignment".
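Reduced to its essence, the failure pattern is a local name bound on only one branch but read unconditionally afterwards (this reproduction is purely illustrative, not the library code):

```python
def build_decoder_config(kwargs):
    if "config" not in kwargs:
        decoder_config = {"is_decoder": True}  # only bound on this branch
    # When a config *was* supplied, decoder_config is never assigned, so the
    # next line raises: UnboundLocalError: local variable 'decoder_config'
    # referenced before assignment
    return decoder_config

build_decoder_config({"config": {"is_decoder": True}})  # reproduces the error
```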
-
Quentin Lhoest authored
* add CustomHFIndex
* typo in config
* update tests
* add custom dataset example
* clean script
* update test data
* minor in test
* docs
* docs
* style
* fix imports
* allow to pass the indexed dataset directly
* update tests
* use multiset DPR
* address thom and patrick's comments
* style
* update dpr tokenizer
* add output_dir flag in use_own_knowledge_dataset.py
* allow custom datasets in examples/rag/finetune.py
* add test for custom dataset in distributed rag retriever
-
Julien Rossi authored
* fix 5990
* accommodate iterable dataset without predefined length (see the sketch after this list)
* set it as 1 use case: provide max_steps, and NO num_epochs
* Is a merge of master and PR 5995
* fix trainer test under TF
* fix only for torch
* TF trainer untouched
* trainer tests are skipped when no torch
* address comments
* fix quality checks
* remove torch.dataset from test_trainer
* unnecessary inheritance
* RegressionDataset implements all needed methods __len__ and __getitem__
* fix quality checks
* restore RegressionDataset
* was wrongly under is_torch_available()
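A sketch of the use case this enables (the dataset and the max_steps value are illustrative): an IterableDataset has no `__len__`, so an epoch count is undefined and the training length must be capped explicitly via max_steps.

```python
import torch
from torch.utils.data import IterableDataset

class StreamDataset(IterableDataset):
    """An unbounded stream: no __len__, so "epochs" are undefined."""
    def __iter__(self):
        while True:
            yield {"x": torch.randn(4), "label": int(torch.randint(0, 2, ()))}

# Without a length, steps cannot be derived from num_epochs; cap explicitly,
# e.g. TrainingArguments(output_dir="out", max_steps=10_000).
for step, example in enumerate(StreamDataset()):
    if step >= 3:  # stand-in for max_steps
        break
    print(step, example["label"])
```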
-
Weizhen authored
* add new model prophetnet

  prophetnet modified
  modify codes as suggested v1
  add prophetnet test files
* still bugs, because of changed output formats of encoder and decoder
* move prophetnet into the latest version
* clean integration tests
* clean tokenizers
* add xlm config to init
* correct typo in init
* further refactoring
* continue refactor
* save parallel
* add decoder_attention_mask
* fix use_cache vs. past_key_values
* fix common tests
* change decoder output logits
* fix xlm tests
* make common tests pass
* change model architecture
* add tokenizer tests
* finalize model structure
* no weight mapping
* correct n-gram stream attention mask as discussed with qweizhen
* remove unused import
* fix index.rst
* fix tests
* delete unnecessary code
* add fast integration test
* rename weights
* final weight remapping
* save intermediate
* Descriptions for Prophetnet Config File
* finish all models
* finish new model outputs
* delete unnecessary files
* refactor encoder layer
* add dummy docs
* code quality
* fix tests
* add model pages to doctree
* further refactor
* more refactor, more tests
* finish code refactor and tests
* remove unnecessary files
* further clean up
* add docstring template
* finish tokenizer doc
* finish prophetnet
* fix copies
* fix typos
* fix tf tests
* fix fp16
* fix tf test 2nd try
* fix code quality
* add test for each model
* merge new tests to branch
* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/modeling_prophetnet.py
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update utils/check_repo.py
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* apply sams and sylvains comments
* make style
* remove unnecessary code
* Update README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/configuration_prophetnet.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* implement lysandres comments
* correct docs
* fix isort
* fix tokenizers
* fix copies

Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-