Commits · 21468337673f6f93c1761580195e2cfa8d43774e · chenpangpang / transformers

07 Sep, 2021 4 commits
- Add unit_divisor to downloads (#13468) · 21468337
  Anton Lozhkov authored Sep 07, 2021
  
  21468337
- Optimized bad word ids (#13433) · 63b90a51
  guillaume-be authored Sep 07, 2021
```
* Optimized bad word ids generation

* Fixed optimized bad token ids

* Updated style
```
  63b90a51
- Fixing by correctly raising UnicodeDecodeError. (#13449) · 5c7789d4
  Nicolas Patry authored Sep 07, 2021
  
  5c7789d4
- Fix img classification tests (#13456) · 79815090
  Nathan Raw authored Sep 07, 2021
```
* ✅ Update image-classification example's tests

* 🔥 remove cats_and_dogs test samples

* 💄 fix flake8
```
  79815090
06 Sep, 2021 11 commits

Update setup.py (#13421) · 92d4ef9a
Anurag Kumar authored Sep 07, 2021

92d4ef9a
Update version of `packaging` package (#13454) · 75858ca1
Shiv Dhar authored Sep 07, 2021

75858ca1
Install libsndfile (#13403) · f8363e49
Anton Lozhkov authored Sep 07, 2021

f8363e49
Add TAPAS MLM-only models (#13408) · 5642a555
NielsRogge authored Sep 06, 2021
```
* Add conversion of TapasForMaskedLM

* Add copied from statements
```
5642a555
skip image classification test (#13451) · 2dd975b2
Suraj Patil authored Sep 06, 2021

2dd975b2

Update model configs - Allow setters for common properties (#13026) · c8be8a9a

Nils Reimers authored Sep 06, 2021

* refactor GPT Config to allow dyn. properties

* make attribute_map a class attribute

* remove old code

* update unit test to test config: Add test for common properties setter

* update unit test to test config: Add test for common properties passed as parameters to __init__

* update to black code format

* Allow that setters are not defined for certain config classes

* update config classes to implement attribute_map

* bugfix lxmert config - id2labels was not defined when num_labels was set

* update broken configs - add attribute_maps

* update bart config

* update black codestyle

* update documentation on common config attributes

* update GPTJ config to new attribute map

* update docs on common attributes

* gptj config: add max_position_embeddings

* gptj config: format with black

* update speech to text 2 config

* format doc file to max_len 119

* update config template

c8be8a9a
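
The commit above makes `attribute_map` a class attribute so that common property names can be read and set on any config. A minimal standalone sketch of that idea follows. It is not the transformers implementation: the class `SketchConfig` and the two mappings are hypothetical and chosen only for illustration.

```python
class SketchConfig:
    """Toy config (hypothetical): common names are aliases for model-specific ones."""

    # Class-level map from a common attribute name to the model-specific name.
    attribute_map = {"hidden_size": "n_embd", "num_hidden_layers": "n_layer"}

    def __init__(self, n_embd=768, n_layer=12, **kwargs):
        self.n_embd = n_embd
        self.n_layer = n_layer
        # Common names passed as keyword arguments go through __setattr__ too.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, name, value):
        # Redirect writes on a common name to the model-specific attribute.
        name = type(self).attribute_map.get(name, name)
        super().__setattr__(name, value)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so real attributes are unaffected.
        attribute_map = type(self).attribute_map
        if name in attribute_map:
            return getattr(self, attribute_map[name])
        raise AttributeError(name)


cfg = SketchConfig(hidden_size=1024)   # common name passed to __init__
cfg.num_hidden_layers = 6              # common name used as a setter
print(cfg.n_embd, cfg.hidden_size, cfg.n_layer)
```

Keeping the map on the class rather than on each instance means every config object of that type shares one alias table, which is the design choice the "make attribute_map a class attribute" bullet describes.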

Adding a test for multibytes unicode. (#13447) · cf4eb8b3

Nicolas Patry authored Sep 06, 2021

* Adding a test for multibytes unicode.

* Adding some accents.

* Making sure decoding works.

* Make tests passing by being cheesy.

cf4eb8b3

up (#13448) · 607611f2
Patrick von Platen authored Sep 06, 2021

607611f2
add torchvision in example test requirements (#13438) · 6b29bff8
Suraj Patil authored Sep 06, 2021

6b29bff8
Fix scheduled tests for `SpeechEncoderDecoderModel` (#13422) · 26700a95
Anton Lozhkov authored Sep 06, 2021
```
* Add inputs to pretrained tests

* Make style
```
26700a95
Fix tests without any real effect (#13406) · 73ad2588
Yih-Dar authored Sep 06, 2021
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
73ad2588

02 Sep, 2021 11 commits

✨ Add PyTorch image classification example (#13134) · 76c4d8bf

Nathan Raw authored Sep 02, 2021

* ✨ add pytorch image classification example

* 🔥 remove utils.py

* 💄 fix flake8 style issues

* 🔥 remove unnecessary line

* ✨ limit dataset sizes

* 📌 update reqs

* 🎨 restructure - use datasets lib

* 🎨 import transforms directly

* 📝 add comments

* 💄 style

* 🔥 remove flag

* 📌 update requirement warning

* 📝 add vision README.md

* 📝 update README.md

* 📝 update README.md

* 🎨 add image-classification tag to model card

* 🚚 rename vision ➡️ image-classification

* 📝 update image-classification README.md

76c4d8bf

up (#13396) · 9bd5d97c
Patrick von Platen authored Sep 02, 2021

9bd5d97c
fix (#13395) · efa4f5f0
Patrick von Platen authored Sep 02, 2021

efa4f5f0

[docs] Update perplexity.rst to use negative log likelihood (#13386) · 596bb85f

Aman Madaan authored Sep 02, 2021

* [docs] Update perplexity.rst to use negative log likelihood

Model `forward` returns the negative log likelihood. The document correctly defines and calculates perplexity, but the description and variable names are inconsistent, which might cause confusion.

* [docs] restyle perplexity.rst

596bb85f
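
The commit above notes that the model's `forward` returns the negative log likelihood and that perplexity is computed from it. As a rough illustration, perplexity is the exponential of the mean per-token negative log likelihood. The sketch below uses plain Python with hypothetical per-token values; it is not the code from `perplexity.rst`, and the function name is an assumption.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log likelihood) for a list of per-token NLLs."""
    if not token_nlls:
        raise ValueError("need at least one per-token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical values: every token was assigned probability 1/4,
# so each per-token NLL is log(4) and the perplexity is about 4.
example_nlls = [math.log(4.0)] * 3
print(perplexity(example_nlls))
```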

Correct order of overflowing_tokens for slow tokenizer (#13179) · b91e65af

Apoorv Garg authored Sep 02, 2021

* correct order of overflowing_tokens for slow tokenizer (issue fix #13148)

* python 3.9 requires sentencepiece version 0.1.94 or above

* slicing of ids fixed in truncated_sequence()

* Update setup.py

* Correct order of overflowing tokens for pair of sentences

* code reformatted

* Update tokenization_utils_base.py

* reformatting file

* test to check single_input added

* missing function restored

* test to check pair_input overflowing tokens order

* test to check pair_input overflowing tokens order

* test to check pair_input overflowing tokens order

* added an error message for pair of seq and longest_first strategy

* test for pair_input modified

* variable name corrected

* fixed a typo in error message

* requested changes implemented

* required test added

* Corrected the message to match test message

* added error message for Luke Tokenizer

* lost test recovered

* docstring for truncate_sequences and prepare_for_model updated

* docstring for luke tokenizer updated

* updated ENCODE_PLUS_ADDITIONAL_KWARGS_DOCSTRING

* aligned text and fixed punctuation

* improved style and quality of code

* fixed error_msg in truncate_sequences

* replaced encode_plus method with regular call method

* clean up

* rephrased the docstring

b91e65af

Enabling automatic loading of tokenizer with `pipeline` for `audio-classification`. (#13376) · c9184a2e
Nicolas Patry authored Sep 02, 2021
c9184a2e
fix example (#13387) · e92140c5
Suraj Patil authored Sep 02, 2021

e92140c5
Add tokenizer docs (#13373) · 4114c9a7
NielsRogge authored Sep 02, 2021

4114c9a7

Update clip loss calculation (#13217) · 872e6be0

Sachin Abeywardana authored Sep 02, 2021

* Update clip loss calculation

Hello, I'm the author of the blog you took the snippet from. I think this way of calculating is possibly slightly more accurate for calculation.

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

872e6be0

[Flax/run_hybrid_clip] Fix duplicating images when captions_per_image exceeds... · 0a22335e
Eduardo Gonzalez Ponferrada authored Sep 01, 2021
```
[Flax/run_hybrid_clip] Fix duplicating images when captions_per_image exceeds the number of captions, enable truncation 
```
0a22335e
Fix name and get_class method in AutoFeatureExtractor (#13385) · c1c2d68d
Sylvain Gugger authored Sep 01, 2021

c1c2d68d

01 Sep, 2021 14 commits

fix (#13383) · a105c9b7
Patrick von Platen authored Sep 01, 2021

a105c9b7
[Flax] Fix BigBird (#13380) · 4475f1dc
Patrick von Platen authored Sep 01, 2021
```
* finish

* finish
```
4475f1dc
Fix RemBERT (#13375) · ecd53971
Lysandre Debut authored Sep 01, 2021

ecd53971
Add missing feature extractors (#13374) · 33b7c9a8
Lysandre Debut authored Sep 01, 2021

33b7c9a8
Add `Hubert` to the `AutoFeatureExtractor` (#13366) · 2406892a
Anton Lozhkov authored Sep 01, 2021
```
* Add Hubert to the auto feature extractor

* Fix import structure
```
2406892a
Properly register missing submodules in main init (#13372) · 6b353264
Sylvain Gugger authored Sep 01, 2021

6b353264
Fix assertion (#13369) · 4b7988eb
NielsRogge authored Sep 01, 2021

4b7988eb

Fix tokenizer saving during training with `Trainer` (#12806) · c4d78f01

SaulLu authored Sep 01, 2021

* add test in trainer and test tokenizer saving with trainer

* quality

* reverse trainer changes

* replace test in test_trainer by a test for all the tokenizers

* format

* add can_save_slow_tokenizer attribute to all tokenizers

* fix Herbert

* format

* Change comment in error

* add comments and a new assert

* Update src/transformers/models/albert/tokenization_albert_fast.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change ValueError barthez

* change ValueError BigBird

* change ValueError Camembert

* change ValueError Mbart50

* change ValueError Pegasus

* change ValueError ReFormer

* change ValueError T5

* change ValueError RoBERTa

* XLNET fast

* Update tests/test_tokenization_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change `assert` into `self.assertIn`

* format
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

c4d78f01

Redeploy stable documentation · c1b20e42
Sylvain Gugger authored Sep 01, 2021

c1b20e42
Revert "Correct wrong function signatures on the docs website (#13198)" · 85cb4477
Li-Huai (Allan) Lin authored Aug 30, 2021
```
This reverts commit ffecfea9.
```
85cb4477

Improve T5 docs (#13240) · 4766e009

NielsRogge authored Sep 01, 2021

* Remove disclaimer

* First draft

* Fix rebase

* Improve docs some more

* Add inference section

* Improve example scripts section

* Improve code examples of modeling files

* Add docs regarding task prefix

* Address @craffel's comments

* Apply suggestions from @patrickvonplaten's review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add suggestions from code review

* Apply @sgugger's suggestions

* Fix Flax code examples

* Fix index.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

4766e009

fix wrong 'cls' masking for bigbird qa model output (#13143) · ba1b3db7
donggyukimc authored Sep 01, 2021

ba1b3db7
Fixes for the documentation (#13361) · 7a26307e
Sylvain Gugger authored Sep 01, 2021

7a26307e

Add SpeechEncoderDecoder & Speech2Text2 (#13186) · 0b8c84e1

Patrick von Platen authored Sep 01, 2021

* fix_torch_device_generate_test

* remove @

* up

* correct some bugs

* correct model

* finish speech2text extension

* up

* up

* up

* up

* Update utils/custom_init_isort.py

* up

* up

* update with tokenizer

* correct old tok

* correct old tok

* fix bug

* up

* up

* add more tests

* up

* fix docs

* up

* fix some more tests

* add better config

* correct some more things

* fix tests

* improve docs

* Apply suggestions from code review

* Apply suggestions from code review

* final fixes

* finalize

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply suggestions Lysandre and Sylvain

* apply nicos suggestions

* upload everything

* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

0b8c84e1