  1. 18 Oct, 2021 8 commits
  2. 17 Oct, 2021 1 commit
  3. 16 Oct, 2021 3 commits
  4. 15 Oct, 2021 4 commits
  5. 14 Oct, 2021 10 commits
  6. 13 Oct, 2021 1 commit
    • NielsRogge
      Add TrOCR + VisionEncoderDecoderModel (#13874) · 408b2d2b
      * First draft
      
      * Update self-attention of RoBERTa as proposition
      
      * Improve conversion script
      
      * Add TrOCR decoder-only model
      
      * More improvements
      
      * Make forward pass with pretrained weights work
      
      * More improvements
      
      * Some more improvements
      
      * More improvements
      
      * Make conversion work
      
      * Clean up print statements
      
      * Add documentation, processor
      
      * Add test files
      
      * Small improvements
      
      * Some more improvements
      
      * Make fix-copies, improve docs
      
      * Make all vision encoder decoder model tests pass
      
      * Make conversion script support other models
      
      * Update URL for OCR image
      
      * Update conversion script
      
      * Fix style & quality
      
      * Add support for the large-printed model
      
      * Fix some issues
      
      * Add print statement for debugging
      
      * Add print statements for debugging
      
      * Add possible fix for sinusoidal embedding
      
      * Further debugging
      
      * Potential fix v2
      
      * Add more print statements for debugging
      
      * Add more print statements for debugging
      
      * Debug more
      
      * Comment out print statements
      
      * Make conversion of large printed model possible, address review comments
      
      * Make it possible to convert the stage1 checkpoints
      
      * Clean up code, apply suggestions from code review
      
      * Apply suggestions from code review, use Microsoft models in tests
      
      * Rename encoder_hidden_size to cross_attention_hidden_size
      
      * Improve docs
      408b2d2b
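One of the bullets above mentions a possible fix for the sinusoidal embedding used by the TrOCR decoder. As a rough, non-authoritative sketch of what such embeddings look like (fairseq-style layout, with a sin half concatenated to a cos half; the function name and argument names are assumptions, not the library's code):

```python
import math

def sinusoidal_embeddings(num_positions, dim, padding_idx=None):
    """Build fairseq-style sinusoidal position embeddings.

    Returns `num_positions` vectors of length `dim`: the first half of each
    vector holds sin components, the second half holds cos components.
    """
    half = dim // 2
    # Frequencies decay geometrically, as in "Attention Is All You Need".
    emb_scale = math.log(10000.0) / max(half - 1, 1)
    table = []
    for pos in range(num_positions):
        row = [0.0] * dim
        for i in range(half):
            angle = pos * math.exp(-i * emb_scale)
            row[i] = math.sin(angle)         # first half: sin
            row[half + i] = math.cos(angle)  # second half: cos
        if padding_idx is not None and pos == padding_idx:
            row = [0.0] * dim  # padding position gets an all-zero vector
        table.append(row)
    return table
```

Because the table is a pure function of position, it can be recomputed at any length, which is one reason conversion scripts treat these weights differently from learned embeddings.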
  7. 12 Oct, 2021 7 commits
    • Stas Bekman
    • Yih-Dar
      Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06

      * Add cross attentions to TFGPT2Model
      
      * Add TFEncoderDecoderModel
      
      * Add TFBaseModelOutputWithPoolingAndCrossAttentions
      
      * Add cross attentions to TFBertModel
      
      * Fix past or past_key_values argument issue
      
      * Fix generation
      
      * Fix save and load
      
      * Add some checks and comments
      
      * Clean the code that deals with past keys/values
      
      * Add kwargs to processing_inputs
      
      * Add serving_output to TFEncoderDecoderModel
      
      * Some cleaning + fix use_cache value issue
      
      * Fix tests + add bert2bert/bert2gpt2 tests
      
      * Fix more tests
      
      * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
      
      * Fix return_dict_in_generate in tf generation
      
      * Fix is_token_logit_eos_token bug in tf generation
      
      * Finalize the tests after fixing some bugs
      
      * Fix another is_token_logit_eos_token bug in tf generation
      
      * Add/Update docs
      
      * Add TFBertEncoderDecoderModelTest
      
      * Clean test script
      
      * Add TFEncoderDecoderModel to the library
      
      * Add cross attentions to TFRobertaModel
      
      * Add TFRobertaEncoderDecoderModelTest
      
      * make style
      
      * Change the way of position_ids computation
      
      * bug fix
      
      * Fix copies in tf_albert
      
      * Remove some copied from and apply some fix-copies
      
      * Remove some copied
      
      * Add cross attentions to some other TF models
      
      * Remove encoder_hidden_states from TFLayoutLMModel.call for now
      
      * Make style
      
      * Fix TFRemBertForCausalLM
      
      * Revert the change to longformer + Remove copies
      
      * Revert the change to albert and convbert + Remove copies
      
      * make quality
      
      * make style
      
      * Add TFRembertEncoderDecoderModelTest
      
      * make quality and fix-copies
      
      * test TFRobertaForCausalLM
      
      * Fixes for failed tests
      
      * Fixes for failed tests
      
      * fix more tests
      
      * Fixes for failed tests
      
      * Fix Auto mapping order
      
      * Fix TFRemBertEncoder return value
      
      * fix tf_rembert
      
      * Check copies are OK
      
      * Fix "TFBaseModelOutputWithPastAndCrossAttentions is not defined" error
      
      * Add TFEncoderDecoderModelSaveLoadTests
      
      * fix tf weight loading
      
      * check the change of use_cache
      
      * Revert the change
      
      * Add missing test_for_causal_lm for TFRobertaModelTest
      
      * Try cleaning past
      
      * fix _reorder_cache
      
      * Revert some files to original versions
      
      * Keep as many copies as possible
      
      * Apply suggested changes - Use raise ValueError instead of assert
      
      * Move import to top
      
      * Fix wrong require_torch
      
      * Replace more assert by raise ValueError
      
      * Add test_pt_tf_model_equivalence (the test won't pass for now)
      
      * add test for loading/saving
      
      * finish
      
      * finish
      
      * Remove test_pt_tf_model_equivalence
      
      * Update tf modeling template
      
      * Remove pooling, added in the prev. commit, from MainLayer
      
      * Update tf modeling test template
      
      * Move inputs["use_cache"] = False to modeling_tf_utils.py
      
      * Fix torch.Tensor in the comment
      
      * fix use_cache
      
      * Fix missing use_cache in ElectraConfig
      
      * Add a note to from_pretrained
      
      * Fix style
      
      * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
      
      * Fix TFMLP (in TFGPT2) activation issue
      
      * Fix None past_key_values value in serving_output
      
      * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
      
      * Apply review suggestions - style for cross_attns in serving_output
      
      * Apply review suggestions - change assert + docstrings
      
      * break the error message to respect the char limit
      
      * deprecate the argument past
      
      * fix docstring style
      
      * Update the encoder-decoder rst file
      
      * fix Unknown interpreted text role "method"
      
      * fix typo
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      8b240a06
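Several bullets above ("Fix past or past_key_values argument issue", "deprecate the argument past") revolve around accepting the legacy `past` argument while steering callers toward `past_key_values`. A minimal sketch of that deprecation pattern, using a hypothetical helper (the real handling lives inside the TF modeling code):

```python
import warnings

def resolve_past_key_values(past=None, past_key_values=None):
    """Accept the legacy `past` argument but steer callers to `past_key_values`.

    Illustrative only: shows the usual shape of a soft deprecation, where the
    old name keeps working but emits a FutureWarning.
    """
    if past is not None:
        if past_key_values is not None:
            raise ValueError("Pass either `past` or `past_key_values`, not both.")
        warnings.warn(
            "The `past` argument is deprecated; use `past_key_values` instead.",
            FutureWarning,
        )
        past_key_values = past  # transparently forward the legacy value
    return past_key_values
```

Raising `ValueError` instead of asserting mirrors another review suggestion in the bullet list above.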
    • Nicolas Patry
      Fixing the lecture values by making sure defaults are not changed (#13976) · 26b6ef79
      384 // 4 < 128 would break `doc_stride`.
      26b6ef79
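The terse body above refers to the question-answering pipeline: when a caller shrinks `max_seq_len` (here 384 // 4 = 96) without shrinking the default `doc_stride` of 128, the overlap between document chunks exceeds the chunk size itself and chunking breaks. A hedged sketch of the constraint, with illustrative names:

```python
def check_doc_stride(max_seq_len, doc_stride):
    """`doc_stride` is the overlap between consecutive document chunks;
    it must be smaller than the chunk size, or the window never advances.
    """
    if doc_stride >= max_seq_len:
        raise ValueError(
            f"doc_stride ({doc_stride}) must be smaller than "
            f"max_seq_len ({max_seq_len})."
        )
    return max_seq_len - doc_stride  # how far the window advances per chunk

# The failing case from the commit body: 384 // 4 = 96, which is < 128.
```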
    • Patrick von Platen
      [Wav2Vec2] Make sure tensors are always bool for mask_indices (#13977) · 58bf8825
      * correct long to bool
      
      * up
      
      * correct code
      58bf8825
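The "correct long to bool" bullet addresses a subtle indexing pitfall: an integer (long) array used as an index performs gather-style fancy indexing, repeatedly selecting rows 0 and 1, instead of acting as a boolean mask. A small NumPy sketch of the pitfall and the fix (helper name and shapes are illustrative; the actual change is in the Wav2Vec2 modeling code):

```python
import numpy as np

def apply_time_mask(hidden_states, mask_indices, mask_value=0.0):
    """Overwrite masked time steps with `mask_value`.

    `mask_indices` may arrive as a 0/1 integer (long) array; indexing with
    integers would gather rows 0 and 1 rather than mask, so cast to bool
    first -- the essence of the fix above.
    """
    mask = np.asarray(mask_indices).astype(bool)  # long -> bool
    out = hidden_states.copy()
    out[mask] = mask_value  # boolean mask: zero out only the flagged steps
    return out
```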
    • Mishig Davaadorj
      11c043d2
    • Hardian Lawi
      85d69a7d
    • Lysandre Debut
  8. 11 Oct, 2021 6 commits