Commits · 763ece2feadf51fc5a26c5acbcdf49bab119f3bd · chenpangpang / transformers

27 Jan, 2021 14 commits

Fix model templates (#9842) · 763ece2f
Lysandre Debut authored Jan 27, 2021

763ece2f
Fix template (#9840) · bd701ab1
Julien Plu authored Jan 27, 2021

bd701ab1

Add a flag for find_unused_parameters (#9820) · c7b7bd99

Sylvain Gugger authored Jan 27, 2021



* Add a flag for find_unused_parameters

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Remove negation
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

c7b7bd99

Clean TF Bert (#9788) · 4adbdce5

Julien Plu authored Jan 27, 2021

* Start cleaning BERT

* Clean BERT and all the models that depend on it

* Fix attribute name

* Apply style

* Apply Sylvain's comments

* Apply Lysandre's comments

* remove unused import

4adbdce5

Delete a needless duplicate condition (#9826) · f0329ea5
tomohideshibata authored Jan 27, 2021
```
Co-authored-by: Tomohide Shibata <tomshiba@yahoo-corp.jp>
```
f0329ea5

Remove a TF usage warning and rework the documentation (#9756) · a1720694

Julien Plu authored Jan 27, 2021

* Rework documentation

* Update the template

* Trigger CI

* Restore the warning but with the TF logger

* Update convbert doc

a1720694

Adding a test to prevent late failure in the Table question answering pipeline (#9808) · 285c6262

Nicolas Patry authored Jan 27, 2021

- If the table is empty, the line that contains `answer[0]` will fail.
- This PR adds a check to guard the `answer[0]` access.
- Also adds an early check for the presence of `table` and `query` to
prevent late failure and give a better error message.
- Adds a few tests to make sure these errors are correctly raised.

285c6262
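
The early checks described in #9808 can be pictured with a minimal sketch. The helper name `validate_table_qa_inputs` and the error messages are hypothetical and are not taken from the pipeline's source. The sketch only illustrates rejecting a missing or empty `table` or `query` up front, rather than failing later on `answer[0]`.

```python
def validate_table_qa_inputs(table, query):
    """Hypothetical helper (not the pipeline's actual code).

    Fails early with a descriptive message when `table` or `query`
    is missing or empty, before any model work is done.
    """
    if table is None:
        raise ValueError("Keyword argument `table` cannot be None.")
    if query is None:
        raise ValueError("Keyword argument `query` cannot be None.")
    if len(table) == 0:
        raise ValueError("`table` is empty.")
    if len(query) == 0:
        raise ValueError("`query` is empty.")
    return table, query
```

Calling the helper with an empty table raises a `ValueError` immediately, which is the kind of behavior the added tests are said to check.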

fix typo with mt5 init (#9830) · a46050d0
Patrick von Platen authored Jan 27, 2021

a46050d0
Fix auto-resume training from checkpoint (#9822) · f4bf0dea
jncasey authored Jan 27, 2021
```
* Fix auto-resume training from checkpoint

* style fixes
```
f4bf0dea
Setup logging with a stdout handler (#9816) · f2fabedb
Sylvain Gugger authored Jan 27, 2021

f2fabedb
Add a test for mixed precision (#9806) · 2c891c15
Julien Plu authored Jan 27, 2021

2c891c15
[Setup.py] update jaxlib (#9831) · d5b40d66
Patrick von Platen authored Jan 27, 2021
```
* update jaxlib

* Update setup.py

* update table
```
d5b40d66

ConvBERT Model (#9717) · f617490e

abhishek thakur authored Jan 27, 2021

* finalize convbert

* finalize convbert

* fix

* fix

* fix

* push

* fix

* tf image patches

* fix torch model

* tf tests

* conversion

* everything aligned

* remove print

* tf tests

* fix tf

* make tf tests pass

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽



* add doc and cleanup

* add electra test again

* fix doc

* fix doc again

* fix doc again

* Update src/transformers/modeling_tf_pytorch_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/conv_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* conv_bert -> convbert

* more fixes from review

* add conversion script

* dont use pretrained embed

* unused config

* suggestions from julien

* some more fixes

* p -> param

* fix copyright

* fix doc

* Update src/transformers/models/convbert/configuration_convbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments from reviews

* fix-copies

* fix style

* revert shape_list
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

f617490e

fix led not defined (#9828) · e575e062
Patrick von Platen authored Jan 27, 2021

e575e062

26 Jan, 2021 13 commits

Fix a bug in run_glue.py (#9812) (#9815) · 059bb258
Yusuke Mori authored Jan 27, 2021

059bb258

Commit the last step on world_process_zero in WandbCallback (#9805) · eba418ac

Tristan Deleu authored Jan 26, 2021

* Commit the last step on world_process_zero in WandbCallback

* Use the environment variable WANDB_LOG_MODEL as a default value in WandbCallback

eba418ac

Allow RAG to output decoder cross-attentions (#9789) · 8edc98bb

Derrick Blakely authored Jan 26, 2021



* get cross attns

* add cross-attns doc strings

* fix typo

* line length

* Apply suggestions from code review
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

8edc98bb

Fix fine-tuning translation scripts (#9809) · 8f6c12d3
Magdalena Biesialska authored Jan 26, 2021

8f6c12d3
Fixed parameter name for logits_processor (#9790) · c37dcff7
Michael Glass authored Jan 26, 2021

c37dcff7

Smdistributed trainer (#9798) · 0d0efd3a

Sylvain Gugger authored Jan 26, 2021

* Add a debug print

* Adapt Trainer to use smdistributed if available

* Forgotten parenthesis

* Real check for sagemaker

* Don't forget to define device...

* Whoopsie, local_rank is defined differently

* Update since local_rank has the proper value

* Remove debug statement

* More robust check for smdistributed

* Quality

* Deal with key not present error

0d0efd3a

Fix head_mask for model templates · 897a24c8
Lysandre authored Jan 26, 2021

897a24c8

Improve pytorch examples for fp16 (#9796) · 10e5f282

Andrea Cappelli authored Jan 26, 2021



* Pad to 8x for fp16 multiple choice example (#9752)

* Pad to 8x for fp16 squad trainer example (#9752)

* Pad to 8x for fp16 ner example (#9752)

* Pad to 8x for fp16 swag example (#9752)

* Pad to 8x for fp16 qa beam search example (#9752)

* Pad to 8x for fp16 qa example (#9752)

* Pad to 8x for fp16 seq2seq example (#9752)

* Pad to 8x for fp16 glue example (#9752)

* Pad to 8x for fp16 new ner example (#9752)

* update script template #9752

* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

10e5f282
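
The "pad to 8x" idea in #9796 can be sketched in plain Python. This is an illustration under stated assumptions, not the code from the example scripts: the function name `pad_batch` and its parameters are hypothetical. The motivation is presumably that fp16 kernels on recent NVIDIA GPUs tend to run faster when sequence lengths are multiples of 8.

```python
def pad_batch(batch, pad_id=0, fp16=False):
    """Hypothetical sketch: right-pad every sequence to a common length.

    When `fp16` is True, the common length is rounded up to the next
    multiple of 8; otherwise it is the length of the longest sequence.
    """
    longest = max(len(seq) for seq in batch)
    if fp16:
        # Ceiling division, then scale back up to a multiple of 8.
        longest = -(-longest // 8) * 8
    return [list(seq) + [pad_id] * (longest - len(seq)) for seq in batch]
```

With `fp16=False` a batch whose longest sequence has 3 tokens is padded to length 3. With `fp16=True` the same batch is padded to length 8.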

Adding `skip_special_tokens=True` to FillMaskPipeline (#9783) · 781e4b13

Nicolas Patry authored Jan 26, 2021

* We most likely don't want special tokens in this output.

* Adding `skip_special_tokens=True` to FillMaskPipeline

- It's backward incompatible.
- It makes more sense for pipelines to remove references to
special_tokens (all of the other pipelines do that).
- Keeping special tokens makes it hard for users to actually remove them
  because all models have different tokens (<s>, <cls>, [CLS], ....)

* Fixing `token_str` in the same vein, and actually fixing the tests too!

781e4b13
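
The effect of `skip_special_tokens=True` described in #9783 can be shown with a toy decoder. This is a stand-in written for illustration only. The `SPECIAL_TOKENS` set and the `decode` function below are assumptions and do not reproduce any real tokenizer.

```python
# Illustrative set only; each real model defines its own special tokens.
SPECIAL_TOKENS = {"<s>", "</s>", "<cls>", "[CLS]", "[SEP]", "<pad>"}


def decode(tokens, skip_special_tokens=False):
    """Toy decoder: join tokens with spaces, optionally dropping special ones."""
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in SPECIAL_TOKENS]
    return " ".join(tokens)
```

Without the flag, a caller would have to know each model's special tokens to strip them, which is the difficulty the commit message points out.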

Add head_mask/decoder_head_mask for TF BART models (#9639) · 1867d9a8

Daniel Stancl authored Jan 26, 2021

* Add head_mask/decoder_head_mask for TF BART models

* Add head_mask and decoder_head_mask input arguments for TF BART-based
models as a TF counterpart to the PR #9569

* Add test_headmasking functionality to tests/test_modeling_tf_common.py

* TODO: Add a test to verify that we can get a gradient back for
importance score computation

* Remove redundant #TODO note

Remove redundant #TODO note from tests/test_modeling_tf_common.py

* Fix assertions

* Make style

* Fix ...Model input args and adjust one new test

* Add back head_mask and decoder_head_mask to BART-based ...Model
after the last commit

* Remove head_mask and decoder_head_mask from input_dict
in TF test_train_pipeline_custom_model as these two have different
shape than other input args (Necessary for passing this test)

* Revert adding global_rng in test_modeling_tf_common.py

1867d9a8

Fix broken links in the converting tf ckpt document (#9791) · cb73ab5a

Yusuke Mori authored Jan 26, 2021



* Fix broken links in the converting tf ckpt document

* Update docs/source/converting_tensorflow_models.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Reflect the review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

cb73ab5a

[Flaky Generation Tests] Make sure that no early stopping is happening for beam search (#9794) · d94cc2f9
Patrick von Platen authored Jan 26, 2021
```
* fix ci

* fix ci

* renaming

* fix dup line
```
d94cc2f9

[PR/Issue templates] normalize, group, sort + add myself for deepspeed (#9706) · 0fdbf085

Stas Bekman authored Jan 25, 2021



* normalize, group, sort + add myself for deepspeed

* new structure

* add ray

* typo

* more suggestions

* more suggestions

* white space

* Update .github/ISSUE_TEMPLATE/bug-report.md
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* add bullets

* sync

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* sync
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

0fdbf085

25 Jan, 2021 7 commits
- Fix style · af41da50
  Sylvain Gugger authored Jan 25, 2021
  
  af41da50
- Auto-resume training from checkpoint (#9776) · caf4abf7
  Sylvain Gugger authored Jan 25, 2021
```
* Auto-resume training from checkpoint

* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
  caf4abf7
- Actual fix (#9787) · 0f443436
  Lysandre Debut authored Jan 25, 2021
  
  0f443436
- [fsmt] onnx triu workaround (#9738) · fac7cfb1
  Stas Bekman authored Jan 25, 2021
```
* onnx triu workaround

* style

* working this time

* add test

* more efficient version
```
  fac7cfb1
- Fix a typo in Trainer.hyperparameter_search docstring (#9762) · 626116b7
  Sorami Hisamoto authored Jan 25, 2021
```
`compute_objectie` => `compute_objective`
```
  626116b7
- Use object store to pass trainer object to Ray Tune (#9749) · d63ab615
  Kai Fricke authored Jan 25, 2021
  
  d63ab615
- Fix TFTrainer prediction output (#9662) · 6312fed4
  Maria Janina Sarol authored Jan 25, 2021
```
* Fix TFTrainer prediction output

* Update trainer_tf.py

* Fix TFTrainer prediction output

* Fix evaluation_loss update in TFTrainer

* Fix TFTrainer prediction output
```
  6312fed4
23 Jan, 2021 2 commits
- Fix broken [Open in Colab] links (#9761) · 9152f160
  Wilfried L. Bounsi authored Jan 23, 2021
  
  9152f160
- token_type_ids isn't used (#9736) · b7b7e5d0
  Stas Bekman authored Jan 22, 2021
  
  b7b7e5d0
22 Jan, 2021 4 commits
- Fix test (#9755) · a449ffcb
  Julien Plu authored Jan 22, 2021
  
  a449ffcb
- Add `report_to` training arguments to control the reporting integrations used (#9735) · 82d46feb
  Sylvain Gugger authored Jan 22, 2021
  
  82d46feb
- Fixes to run_seq2seq and instructions (#9734) · 411c5821
  Sylvain Gugger authored Jan 22, 2021
```
* Fixes to run_seq2seq and instructions

* Add more defaults for summarization
```
  411c5821
- Fix some TF slow tests (#9728) · d7c31abf
  Julien Plu authored Jan 22, 2021
```
* Fix saved model tests + fix a graph issue in longformer

* Apply style
```
  d7c31abf