Commits · 47a98fc4cb6a561576309a57b315b042977d194c · chenpangpang / transformers

01 Jun, 2021 12 commits

ByT5 model (#11971) · 47a98fc4

Patrick von Platen authored Jun 01, 2021



* allow tf to use uneven num of layers

* add tokenizer

* finish docs

* finish docs

* Apply suggestions from code review

* include in index

* finish

* Update docs/source/model_doc/byt5.rst
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* apply Sylvain's suggestions

* make style
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

47a98fc4

typo correction (#11973) · 1eb58b45
Jeoung-Minju authored Jun 02, 2021
```
* typo correction

* type corrections
```
1eb58b45
[deepspeed] docs (#11940) · 79712e7e
Stas Bekman authored Jun 01, 2021
```
* deepspeed docs

* cleanup

* cleanup
```
79712e7e
Run the integration tests on schedule tests instead of master tests · 985d7088
Lysandre authored Jun 01, 2021

985d7088

Neptune.ai integration (#11937) · 9996558b

Volodymyr Byno authored Jun 01, 2021

An option that turns on neptune.ai logging
--report_to 'neptune'

Additional ENV variables:
	NEPTUNE_PROJECT
	NEPTUNE_API_TOKEN
	NEPTUNE_RUN_NAME (optional)
	NEPTUNE_STOP_TIMEOUT (optional)

9996558b
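
The Neptune options listed in commit 9996558b could be wired up roughly as follows. This is a minimal sketch, not the integration's own code: the helpers `neptune_env` and `training_command`, the script name `run_glue.py`, and the example values are assumptions made for illustration. Only `--report_to 'neptune'` and the four environment variable names come from the commit message.

```python
import os


def neptune_env(project, api_token, run_name=None, stop_timeout=None):
    """Build the environment variables named in the commit message (hypothetical helper)."""
    env = {"NEPTUNE_PROJECT": project, "NEPTUNE_API_TOKEN": api_token}
    # The last two variables are documented as optional, so set them only when given.
    if run_name is not None:
        env["NEPTUNE_RUN_NAME"] = run_name
    if stop_timeout is not None:
        env["NEPTUNE_STOP_TIMEOUT"] = str(stop_timeout)
    return env


def training_command(script="run_glue.py"):
    """Command line that turns on Neptune logging; the script name is an assumption."""
    return ["python", script, "--report_to", "neptune"]


# Placeholder values; replace them with a real project and token.
env = {**os.environ, **neptune_env("my-workspace/my-project", "<token>", run_name="baseline")}
cmd = training_command()
```

The pair could then be handed to `subprocess.run(cmd, env=env)` to launch the run.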

Authorize args when instantiating an AutoModel (#11956) · ae6ce28f
Lysandre Debut authored Jun 01, 2021

ae6ce28f

Add regression tests for slow sentencepiece tokenizers. (#11737) · fcad8018

Philip May authored Jun 01, 2021

* add test_vocab_size for sentencepiece tok.

* add test_get_vocab for sentencepiece tok.

* add test_convert_token_and_id for sentencepiece tok.

* add test_tokenize_and_convert_tokens_to_string for all tok.

* improve test_tokenize_and_convert_tokens_to_string for sp. tok.

* add common tokenizer integration tests
- for albert
- for barthez

* add tokenizer integration tests to bert gen.

* add most tokenizer integration tests

* fix camembert tokenizer integration test

* add tokenizer integration test to marian

* add tokenizer integration test to reformer

* add typing and doc to tokenizer_integration_test_util

* fix tokenizer integration test of reformer

* improve test_sentencepiece_tokenize_and_convert_tokens_to_string

* empty commit to trigger CI

* fix tokenizer integration test of reformer

* remove code not needed anymore

* empty commit to trigger CI

* empty commit to trigger CI

fcad8018

reinitialize wandb config for each hyperparameter search run (#11945) · c3d958b2
Josh Tanner authored Jun 01, 2021

c3d958b2

bugfixes training_args.py (#11922) · 99dbbdb9

Riccardo Bassani authored Jun 01, 2021

modified according to:
https://pytorch.org/xla/release/1.8.1/_modules/torch_xla/core/xla_model.html

99dbbdb9

modify qa-trainer (#11872) · 7e73601f
Fan Zhang authored Jun 01, 2021
```
* modify qa-trainer

* fix flax model
```
7e73601f

RAG-2nd2end-revamp (#11893) · 9ec0f01b

Shamane Siri authored Jun 01, 2021



* initial

* code quality test

* code quality

* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test end2end retriever

* minor change in test_modeling_rag

* fixed tests

* Update examples/research_projects/rag-end2end-retriever/README.md

typo corrected as suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py

type change suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* Update src/transformers/models/rag/retrieval_rag.py

Adding this change as mentioned by lhoestq.
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

* completed the minor changes suggested by the reviewers
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

9ec0f01b

Add FlaxCLIP (#11883) · ad25fd62

Suraj Patil authored Jun 01, 2021

* add flax CLIP

* default input_shape

* add tests

* fix test

* fix name

* fix docs

* fix shapes

* attend at least 1 token

* flax conv to torch conv

* return floats

* fix equivalence tests

* fix import

* return attention_weights and update tests

* fix docstrings

* address Patrick's comments

* input_shape arg

* add tests for get_image_features and get_text_features methods

* fix tests

ad25fd62

31 May, 2021 4 commits
- Add MT5ForConditionalGeneration as supported arch. to summarization README (#11961) · cfca638a
  Philip May authored May 31, 2021
```
* Add MT5ForConditionalGeneration as supported arch.

* Update README.md
```
  cfca638a
- Remove redundant `nn.log_softmax` in `run_flax_glue.py` (#11920) · 1ab147d6
  Nicholas Vadivelu authored May 31, 2021
```
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`

`optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference.

* Remove unused 'flax.linen' import
```
  1ab147d6
- fix assert (#11935) · fb60c309
  Philip May authored May 31, 2021
  
  fb60c309
- Remove `datasets` submodule · 04a9709c
  Lysandre authored May 31, 2021
  
  04a9709c
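
The reasoning in commit 1ab147d6 (a cross-entropy that takes unnormalized logits already applies `log_softmax`, and `log_softmax` is idempotent) can be checked with a small pure-Python sketch. The two functions below are stand-ins written for this illustration; they are not the `optax` or `flax` implementations.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def softmax_cross_entropy(logits, one_hot_labels):
    """Cross-entropy that normalizes its input itself (stand-in, not optax)."""
    log_probs = log_softmax(logits)
    return -sum(y * lp for y, lp in zip(one_hot_labels, log_probs))


logits = [2.0, -1.0, 0.5]
labels = [0.0, 0.0, 1.0]

once = log_softmax(logits)
twice = log_softmax(once)  # applying it again leaves the values unchanged

loss_from_logits = softmax_cross_entropy(logits, labels)
loss_from_log_probs = softmax_cross_entropy(once, labels)  # redundant extra log_softmax
```

Both losses agree up to floating-point error, which is why dropping the extra call does not change the result.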
28 May, 2021 3 commits

Test optuna and ray (#11924) · 8d171628
Lysandre Debut authored May 28, 2021

8d171628

[Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2 (#11918) · af1a10bf

Jayendra authored May 28, 2021



* Added logic to return attention from flax-bert model and added test cases to check that

* Added new line at the end of file to test_modeling_flax_common.py

* fixing code style

* Fixing RoBERTa and ELECTRA models too, which are copied from BERT

* Added temporary hack to not run test_attention_outputs for FlaxGPT2

* Returning attention weights from GPT2 and changed the tests accordingly.

* last fixes

* bump flax dependency
Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

af1a10bf

Added Sequence Classification class in GPTNeo (#11906) · e1205e47
Bhadresh Savani authored May 28, 2021
```
* seq classification changes

* fix tests
```
e1205e47

27 May, 2021 3 commits

Adding new argument `max_new_tokens` for generate. (#11476) · 80d712fa

Nicolas Patry authored May 27, 2021

* Adding new argument `max_new_tokens` for generate.

This is a proposal to add a new argument `max_new_tokens` to `generate`.
This includes a `MaxNewTokensCriteria` that lets callers that don't
know the token length ahead of time (such as pipeline callers) manage
the length of their generated output more easily.

* Adding a test for the user warning when both `max_length` and
`max_new_tokens` are used together.

* Removed redundant `no_grad`.

80d712fa
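
The idea behind `max_new_tokens` in commit 80d712fa can be sketched without the library: the stopping criterion counts from the prompt length instead of from zero. `MaxNewTokensSketch` and `toy_generate` are hypothetical names for this illustration, and the toy "model" just emits the previous token plus one; none of this is the library's implementation.

```python
class MaxNewTokensSketch:
    """Stop once `max_new_tokens` tokens were appended to a prompt of `start_length` tokens."""

    def __init__(self, start_length, max_new_tokens):
        # Convert the relative budget into an absolute length limit.
        self.max_length = start_length + max_new_tokens

    def __call__(self, input_ids):
        return len(input_ids) >= self.max_length


def toy_generate(prompt_ids, max_new_tokens):
    """Greedy loop with a dummy next-token rule (previous token + 1)."""
    ids = list(prompt_ids)
    should_stop = MaxNewTokensSketch(len(ids), max_new_tokens)
    while not should_stop(ids):
        ids.append(ids[-1] + 1)
    return ids


short = toy_generate([5, 6, 7], max_new_tokens=4)
long = toy_generate(list(range(50)), max_new_tokens=4)
```

Both calls add four tokens regardless of prompt length, which is the convenience the commit message describes for callers that do not know the prompt's token count in advance.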

Update deepspeed config to reflect hyperparameter search parameters (#11896) · 2dd6fb25
Josh Tanner authored May 27, 2021
```
* rebuild deepspeed config for hyperparameter search

* reformat code to fix style issues
```
2dd6fb25
Add Emotion Speech Notebook (#11900) · 42fe0dc2
Patrick von Platen authored May 27, 2021

42fe0dc2

26 May, 2021 7 commits

Flax Generate (#11777) · 996a315e

Patrick von Platen authored May 27, 2021



* fix_torch_device_generate_test

* remove @

* add

* indexing

* correct a couple of tests

* fix tests

* add logits processor

* finish top_k, top_p, temp

* add docs

* correct flax prng key default

* improve generate

* add generation docs

* add docs

* make style

* revert model outputs change

* make style

* correct typo

* fix tests

* fix slow test

* add raise

* finish generation
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

996a315e

Link official Cloud TPU JAX docs (#11892) · 2df54691
Avital Oliver authored May 26, 2021

2df54691

changing find_batch_size to work with tokenizer outputs (#11890) · 1530384e

joerenner authored May 26, 2021



* changing find_batch_size to work with tokenizer outputs

trainer_pt_utils.find_batch_size does not recognize the batch size of BatchEncoding objects. This can cause an error when a trainer relies on find_batch_size to report the number of observed examples in the evaluation loop.

* Trigger CI
Co-authored-by: jrenner <joseph.renner@inria.fr>

1530384e
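
The problem described in commit 1530384e can be illustrated with a small sketch. It assumes that tokenizer outputs behave like a mapping that is not a `dict` subclass, which is modeled here with `collections.UserDict`. `FakeTensor` and `find_batch_size_sketch` are hypothetical stand-ins, not the code in `trainer_pt_utils`.

```python
from collections import UserDict
from collections.abc import Mapping


class FakeTensor:
    """Minimal stand-in for an array type: only carries a shape."""

    def __init__(self, *shape):
        self.shape = shape


def find_batch_size_sketch(batch):
    """Return the first dimension of the first array-like found, or None."""
    if isinstance(batch, (list, tuple)):
        for item in batch:
            size = find_batch_size_sketch(item)
            if size is not None:
                return size
        return None
    # Checking for Mapping (not just dict) also covers UserDict-style containers.
    if isinstance(batch, Mapping):
        for value in batch.values():
            size = find_batch_size_sketch(value)
            if size is not None:
                return size
        return None
    shape = getattr(batch, "shape", None)
    return shape[0] if shape else None


encoding = UserDict({"input_ids": FakeTensor(8, 128), "attention_mask": FakeTensor(8, 128)})
batch_size = find_batch_size_sketch(encoding)
```

A check restricted to `isinstance(batch, dict)` would skip `encoding` entirely, which matches the failure mode the commit message reports.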

[Flax] Allow dataclasses to be jitted (#11886) · d5a72b6e

Patrick von Platen authored May 26, 2021

* fix_torch_device_generate_test

* remove @

* change dataclasses to flax ones

* fix typo

* fix jitted tests

* fix bert & electra

d5a72b6e

Correcting comments in T5Stack to reflect correct tuple order (#11330) · e6126e19

talkhaldi authored May 26, 2021



* Correcting comments to reflect correct tuple order

In order to match the actual order (line 513 and 516, and as accessed in 968), I've changed the order mentioned in comments L962 and L966-967.

* Update modeling_t5.py

Updating another comment as well

* Removing extra space

* Fixing style and quality

* style & quality

* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e6126e19

Fix usage of head masks by TF encoder-decoder models' `generate()` function (#11775) · 0b933584

Daniel Stancl authored May 26, 2021

* Fix Bart

* Fix Blenderbot{,_small}

* Fix LED

* Fix Marian

* Fix MBart

* Fix Pegasus

* Fix T5

* Add test for generation with head_mask

* Add a common TF test

* Override a test for the LED model as head masking is not yet properly implemented

* Remove all head_masks from input preparation for LED

* Drop masking for T5 as it needs a bit of refactor

0b933584

Ensure input tensors are on device. (#11874) · 0b0a5984

francescorubbo authored May 26, 2021

The feature extractor does not create tensors on the appropriate device,
so we call `ensure_tensor_on_device` before feeding the processed inputs
to the model.

0b0a5984

25 May, 2021 9 commits
- [Wav2Vec2ForCTC] example typo fixed (#11878) · a9c797f9
  Ahmet Akkoç authored May 26, 2021
  
  a9c797f9
- [Examples] create model with custom config on the fly (#11798) · 1b653010
  Stas Bekman authored May 25, 2021
```
* create custom model on the fly

* better wording

* add update_from_string

* cleanup

* cleanup

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more bool options

* style

* fix logger

* add test

* add the doc

* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  1b653010
- [lm examples] fix overflow in perplexity calc (#11855) · 6287c929
  Stas Bekman authored May 25, 2021
```
* fix overflow in perplexity calc

* use inf

* fix
```
  6287c929
- [Wav2Vec2] SpecAugment Fast (#11764) · 7630c11f
  Patrick von Platen authored May 25, 2021
```
* first try

* finish
```
  7630c11f
- Add option to log only once in multinode training (#11819) · f086652b
  Sylvain Gugger authored May 25, 2021
```
* Add option to log only once in multinode training

* Use an alternate property
```
  f086652b
- typo (#11858) · b8344a27
  Wang Ran (汪然) authored May 25, 2021
  
  b8344a27
- fixed a small typo in the doc (#11856) · f9880f62
  Shiro T authored May 25, 2021
  
  f9880f62
- Enable memory metrics in tests that need it (#11859) · 6da129cb
  Lysandre Debut authored May 25, 2021
  
  6da129cb
- Add some tests to the slow suite #11860 · db0b2477
  Lysandre Debut authored May 25, 2021
  
  db0b2477
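
The perplexity overflow addressed in commit 6287c929 ("fix overflow in perplexity calc", "use inf") can be reproduced in plain Python, where `math.exp` raises `OverflowError` for large inputs. The function name `perplexity` and the sample loss values are assumptions for illustration; the example scripts may structure this differently.

```python
import math


def perplexity(eval_loss):
    """exp(loss), falling back to infinity when the result does not fit in a float."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        return float("inf")


normal = perplexity(2.0)      # an ordinary loss value
diverged = perplexity(1000.0)  # math.exp(1000.0) alone would raise OverflowError
```

Returning `inf` keeps the evaluation report intact when a run diverges, instead of crashing at the very end.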
24 May, 2021 2 commits

[Trainer] Report both steps and num samples per second (#11818) · afe479ad

Sylvain Gugger authored May 24, 2021



* [Trainer] Report both steps and num samples per second

* Fix batch number

* Update src/transformers/trainer_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

afe479ad
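
Reporting both rates, as commit afe479ad does, amounts to dividing two counters by the same elapsed time. The helper below is a hypothetical sketch: its name, signature, and metric key names are assumptions and may differ from what `trainer_utils.py` actually uses.

```python
def speed_metrics_sketch(split, runtime, num_samples=None, num_steps=None):
    """Return runtime plus samples/steps per second for a given split."""
    metrics = {f"{split}_runtime": round(runtime, 4)}
    if runtime <= 0:
        # Avoid dividing by zero when no time was measured.
        return metrics
    if num_samples is not None:
        metrics[f"{split}_samples_per_second"] = round(num_samples / runtime, 3)
    if num_steps is not None:
        metrics[f"{split}_steps_per_second"] = round(num_steps / runtime, 3)
    return metrics


# 40 steps of batch size 16 in 20 seconds.
metrics = speed_metrics_sketch("train", runtime=20.0, num_samples=640, num_steps=40)
```

Steps per second stays comparable across runs with different batch sizes only when read alongside samples per second, which is a plausible reason to report both.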

Fix two typos in docs (#11852) · eaab9397
Nick Lane-Smith authored May 24, 2021
```
* typo2

* fix typo
```
eaab9397