1. 20 Oct, 2021 1 commit
  2. 19 Oct, 2021 1 commit
  3. 18 Oct, 2021 6 commits
  4. 16 Oct, 2021 1 commit
  5. 15 Oct, 2021 1 commit
  6. 14 Oct, 2021 5 commits
  7. 13 Oct, 2021 1 commit
      Add TrOCR + VisionEncoderDecoderModel (#13874) · 408b2d2b
      NielsRogge authored
      * First draft
      
      * Update self-attention of RoBERTa as proposition
      
      * Improve conversion script
      
      * Add TrOCR decoder-only model
      
      * More improvements
      
      * Make forward pass with pretrained weights work
      
      * More improvements
      
      * Some more improvements
      
      * More improvements
      
      * Make conversion work
      
      * Clean up print statements
      
      * Add documentation, processor
      
      * Add test files
      
      * Small improvements
      
      * Some more improvements
      
      * Make fix-copies, improve docs
      
      * Make all vision encoder decoder model tests pass
      
      * Make conversion script support other models
      
      * Update URL for OCR image
      
      * Update conversion script
      
      * Fix style & quality
      
      * Add support for the large-printed model
      
      * Fix some issues
      
      * Add print statement for debugging
      
      * Add print statements for debugging
      
      * Make possible fix for sinusoidal embedding
      
      * Further debugging
      
      * Potential fix v2
      
      * Add more print statements for debugging
      
      * Add more print statements for debugging
      
      * Debug more
      
      * Comment out print statements
      
      * Make conversion of large printed model possible, address review comments
      
      * Make it possible to convert the stage1 checkpoints
      
      * Clean up code, apply suggestions from code review
      
      * Apply suggestions from code review, use Microsoft models in tests
      
      * Rename encoder_hidden_size to cross_attention_hidden_size
      
      * Improve docs
      408b2d2b
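Several of the debugging bullets above chase a fix for the sinusoidal position embeddings in the TrOCR decoder. For reference, the standard sinusoidal table (sin on even feature indices, cos on odd, as in "Attention Is All You Need") can be sketched as follows; the function name and shapes here are illustrative, not the actual conversion-script code:

```python
import numpy as np

def sinusoidal_embeddings(num_positions: int, dim: int) -> np.ndarray:
    """Standard sinusoidal position embeddings: sin on even dims, cos on odd dims."""
    positions = np.arange(num_positions)[:, None]                        # (num_positions, 1)
    div_term = np.exp(np.arange(0, dim, 2) * (-np.log(10000.0) / dim))   # (dim // 2,)
    emb = np.zeros((num_positions, dim))
    emb[:, 0::2] = np.sin(positions * div_term)
    emb[:, 1::2] = np.cos(positions * div_term)
    return emb

table = sinusoidal_embeddings(512, 256)
print(table.shape)  # (512, 256)
```

Conversion bugs in this area typically come down to an off-by-one in `num_positions` or a transposed sin/cos interleaving, which is why the table is worth comparing entry-by-entry against the original checkpoint.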
  8. 12 Oct, 2021 3 commits
      Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06
      Yih-Dar authored
      
      
      * Add cross attentions to TFGPT2Model
      
      * Add TFEncoderDecoderModel
      
      * Add TFBaseModelOutputWithPoolingAndCrossAttentions
      
      * Add cross attentions to TFBertModel
      
      * Fix past or past_key_values argument issue
      
      * Fix generation
      
      * Fix save and load
      
      * Add some checks and comments
      
      * Clean the code that deals with past keys/values
      
      * Add kwargs to processing_inputs
      
      * Add serving_output to TFEncoderDecoderModel
      
      * Some cleaning + fix use_cache value issue
      
      * Fix tests + add bert2bert/bert2gpt2 tests
      
      * Fix more tests
      
      * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
      
      * Fix return_dict_in_generate in tf generation
      
      * Fix is_token_logit_eos_token bug in tf generation
      
      * Finalize the tests after fixing some bugs
      
      * Fix another is_token_logit_eos_token bug in tf generation
      
      * Add/Update docs
      
      * Add TFBertEncoderDecoderModelTest
      
      * Clean test script
      
      * Add TFEncoderDecoderModel to the library
      
      * Add cross attentions to TFRobertaModel
      
      * Add TFRobertaEncoderDecoderModelTest
      
      * make style
      
      * Change the way of position_ids computation
      
      * bug fix
      
      * Fix copies in tf_albert
      
      * Remove some copied from and apply some fix-copies
      
      * Remove some copied
      
      * Add cross attentions to some other TF models
      
      * Remove encoder_hidden_states from TFLayoutLMModel.call for now
      
      * Make style
      
      * Fix TFRemBertForCausalLM
      
      * Revert the change to longformer + Remove copies
      
      * Revert the change to albert and convbert + Remove copies
      
      * make quality
      
      * make style
      
      * Add TFRembertEncoderDecoderModelTest
      
      * make quality and fix-copies
      
      * test TFRobertaForCausalLM
      
      * Fixes for failed tests
      
      * Fixes for failed tests
      
      * fix more tests
      
      * Fixes for failed tests
      
      * Fix Auto mapping order
      
      * Fix TFRemBertEncoder return value
      
      * fix tf_rembert
      
      * Check copies are OK
      
      * Fix "TFBaseModelOutputWithPastAndCrossAttentions is not defined" error
      
      * Add TFEncoderDecoderModelSaveLoadTests
      
      * fix tf weight loading
      
      * check the change of use_cache
      
      * Revert the change
      
      * Add missing test_for_causal_lm for TFRobertaModelTest
      
      * Try cleaning past
      
      * fix _reorder_cache
      
      * Revert some files to original versions
      
      * Keep as many copies as possible
      
      * Apply suggested changes - Use raise ValueError instead of assert
      
      * Move import to top
      
      * Fix wrong require_torch
      
      * Replace more assert by raise ValueError
      
      * Add test_pt_tf_model_equivalence (the test won't pass for now)
      
      * add test for loading/saving
      
      * finish
      
      * finish
      
      * Remove test_pt_tf_model_equivalence
      
      * Update tf modeling template
      
      * Remove pooling, added in the prev. commit, from MainLayer
      
      * Update tf modeling test template
      
      * Move inputs["use_cache"] = False to modeling_tf_utils.py
      
      * Fix torch.Tensor in the comment
      
      * fix use_cache
      
      * Fix missing use_cache in ElectraConfig
      
      * Add a note to from_pretrained
      
      * Fix style
      
      * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
      
      * Fix TFMLP (in TFGPT2) activation issue
      
      * Fix None past_key_values value in serving_output
      
      * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
      
      * Apply review suggestions - style for cross_attns in serving_output
      
      * Apply review suggestions - change assert + docstrings
      
      * break the error message to respect the char limit
      
      * deprecate the argument past
      
      * fix docstring style
      
      * Update the encoder-decoder rst file
      
      * fix Unknown interpreted text role "method"
      
      * fix typo
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      8b240a06
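One of the bullets above fixes `_reorder_cache`. In general, beam search must reorder each layer's cached key/value states along the batch/beam axis at every step so that cache rows keep following their beams. A minimal framework-agnostic sketch of that operation, using NumPy and illustrative names and shapes (not the actual TF implementation):

```python
import numpy as np

def reorder_cache(past_key_values, beam_idx):
    """Reorder cached key/value states along the batch/beam axis.

    past_key_values: tuple over layers of (key, value) pairs, each of shape
    (batch * num_beams, num_heads, seq_len, head_dim).
    beam_idx: for each output row, the index of the beam it should follow.
    """
    return tuple(
        tuple(np.take(state, beam_idx, axis=0) for state in layer_past)
        for layer_past in past_key_values
    )

# Toy example: 1 layer, 2 beams, 1 head, cached length 3, head_dim 4.
key = np.arange(2 * 1 * 3 * 4, dtype=np.float32).reshape(2, 1, 3, 4)
value = key + 100
past = ((key, value),)

# Suppose both surviving hypotheses descend from beam 1.
reordered = reorder_cache(past, beam_idx=np.array([1, 1]))
```

If the cache is not reordered (or is reordered with the wrong indices), each beam continues generating against another beam's attention history, which is exactly the class of subtle generation bug the test fixes above are guarding against.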
      [Wav2Vec2] Make sure tensors are always bool for mask_indices (#13977) · 58bf8825
      Patrick von Platen authored
      * correct long to bool
      
      * up
      
      * correct code
      58bf8825
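The "correct long to bool" fix matters because an integer 0/1 tensor used as a mask performs fancy (index-based) selection rather than boolean masking. A tiny NumPy illustration of the failure mode (the array names are illustrative, not Wav2Vec2's actual variables):

```python
import numpy as np

hidden_states = np.arange(4 * 2).reshape(4, 2)
mask_indices = np.array([1, 0, 1, 0])  # 0/1 integers, not booleans

# Integer "mask": fancy indexing selects rows 1, 0, 1, 0 -- not a mask at all.
wrong = hidden_states[mask_indices]

# Cast to bool first: rows 0 and 2 are selected, as intended.
right = hidden_states[mask_indices.astype(bool)]

print(wrong.shape, right.shape)  # (4, 2) (2, 2)
```

The same distinction holds for PyTorch tensors, which is why the commit forces `mask_indices` to dtype bool before it is ever used for masking.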
    • Mishig Davaadorj authored
      11c043d2
  9. 11 Oct, 2021 4 commits
  10. 08 Oct, 2021 3 commits
  11. 07 Oct, 2021 2 commits
  12. 06 Oct, 2021 2 commits
  13. 05 Oct, 2021 5 commits
  14. 04 Oct, 2021 2 commits
      Update no_* argument (HfArgumentParser) (#13865) · 12b4d66a
      Bram Vanroy authored
      * update no_* argument
      
      Change the order so that the no_* argument is created after the original argument, and set the no_* argument's default to False
      
      * import copy
      
      * update test
      
      * make style
      
      * Use kwargs to set default=False
      
      * make style
      12b4d66a
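The commit note above describes the general negative-flag pattern: the positive argument is declared first, and the matching `--no_*` flag is added after it. A minimal sketch of that pattern with plain `argparse` (the field name `foo` is hypothetical, and this is not HfArgumentParser's actual internals):

```python
import argparse

parser = argparse.ArgumentParser()

# Positive flag first; defaults to True, like a dataclass field `foo: bool = True`.
parser.add_argument("--foo", action="store_true", default=True)

# The matching negative flag is created *after* it and writes to the same dest,
# so passing --no_foo flips foo to False without adding a second attribute.
parser.add_argument("--no_foo", action="store_false", dest="foo")

print(parser.parse_args([]).foo)            # True
print(parser.parse_args(["--no_foo"]).foo)  # False
```

Declaration order matters here because argparse applies the first action's default for a given dest; registering `--no_foo` first would let its default clobber the intended True.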
      Add Mistral GPT-2 Stability Tweaks (#13573) · 3a8de58c
      Sidd Karamcheti authored
      
      
      * Add layer-wise scaling
      
      * Add reorder & upcasting argument
      
      * Add OpenAI GPT-2 weight initialization scheme
      
      * start `layer_idx` count at zero for consistency
      
      * disentangle the attn function from the reordered-and-upcast attn function
      
      * rename `scale_attn_by_layer` to `scale_attn_by_layer_id`
      
      * make autocast from amp compatible with pytorch<1.6
      
      * fix docstring
      
      * style fixes
      
      * Add fixes from PR feedback, style tweaks
      
      * Fix doc whitespace
      
      * Reformat
      
      * First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests
      
      * Rename scale_attn_by_layer_idx, add tip
      
      * Remove extra newline
      
      * add test for weight initialization
      
      * update code format
      
      * add assert check weights are fp32
      
      * remove assert
      
      * Fix incorrect merge
      
      * Fix shape mismatch in baddbmm
      
      * Add generation test for Mistral flags
      Co-authored-by: leandro <leandro.vonwerra@spoud.io>
      Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
      Co-authored-by: J38 <jebolton@stanford.edu>
      3a8de58c
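The layer-wise scaling bullets above refer to the Mistral stability tweak of dividing attention logits by `layer_idx + 1` on top of the usual `1 / sqrt(head_dim)` factor, so deeper layers get smaller logits. A toy NumPy sketch of the idea (shapes and names are illustrative, not the GPT-2 modeling code):

```python
import numpy as np

def attn_scores(q, k, layer_idx, scale_attn_by_inverse_layer_idx=True):
    """Dot-product attention logits with an extra layer-wise stability divisor."""
    scores = q @ k.T / np.sqrt(q.shape[-1])      # usual 1/sqrt(head_dim) scaling
    if scale_attn_by_inverse_layer_idx:
        scores = scores / float(layer_idx + 1)   # divisor grows with layer depth
    return scores

q = np.ones((2, 4))
k = np.ones((3, 4))
s0 = attn_scores(q, k, layer_idx=0)  # divisor 1: unchanged
s3 = attn_scores(q, k, layer_idx=3)  # divisor 4: logits shrunk
print(s0[0, 0], s3[0, 0])  # 2.0 0.5
```

The "reorder & upcast" tweak mentioned alongside it is complementary: it computes the attention matmul in fp32 even under mixed precision, again to keep large logits from overflowing in half precision.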
  15. 30 Sep, 2021 2 commits
  16. 29 Sep, 2021 1 commit