- 02 Sep, 2021 9 commits
-
-
Patrick von Platen authored
-
Aman Madaan authored
* [docs] Update perplexity.rst to use negative log likelihood

  Model `forward` returns the negative log likelihood. The document correctly defines and calculates perplexity, but the description and variable names are inconsistent, which might cause confusion.
* [docs] restyle perplexity.rst
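A minimal sketch of the relationship this doc fix clarifies, assuming a GPT-2 checkpoint and a toy input (both are illustrative choices, not the exact perplexity.rst code):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

encodings = tokenizer("Perplexity measures how well a model predicts a sample.", return_tensors="pt")
with torch.no_grad():
    # When labels are passed, the returned loss is the mean negative log likelihood
    # (cross-entropy) per predicted token.
    outputs = model(encodings.input_ids, labels=encodings.input_ids)

neg_log_likelihood = outputs.loss
perplexity = torch.exp(neg_log_likelihood)  # perplexity = exp(NLL)
print(perplexity.item())
```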
-
Apoorv Garg authored
* correct order of overflowing_tokens for slow tokenizer (issue fix #13148)
* python 3.9 requires sentencepiece version 0.1.94 or above
* slicing of ids fixed in truncated_sequence()
* Update setup.py
* Correct order of overflowing tokens for pair of sentences
* code reformatted
* Update tokenization_utils_base.py
* reformatting file
* test to check single_input added
* missing function restored
* test to check pair_input overflowing tokens order
* test to check pair_input overflowing tokens order
* test to check pair_input overflowing tokens order
* added an error message for pair of seq and longest_first strategy
* test for pair_input modified
* variable name corrected
* fixed a typo in error message
* requested changes implemented
* required test added
* Corrected the message to match test message
* added error message for Luke Tokenizer
* lost test recovered
* docstring for truncate_sequences and prepare_for_model updated
* docstring for luke tokenizer updated
* updated ENCODE_PLUS_ADDITIONAL_KWARGS_DOCSTRING
* aligned text and fixed punctuation
* improved style and quality of code
* fixed error_msg in truncate_sequences
* replaced encode_plus method with regular call method
* clean up
* rephrased the docstring
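A small sketch of the behavior this PR touches, assuming a BERT checkpoint and a toy max length; the key names follow the slow (Python) tokenizer API as described in the PR, and the point of the fix is that the cut-off ids come back in their original left-to-right order:

```python
from transformers import BertTokenizer  # slow tokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

encoded = tokenizer(
    "a fairly long sentence that will not fit into the tiny maximum length",
    max_length=8,
    truncation=True,
    return_overflowing_tokens=True,
)
print(encoded["input_ids"])            # the truncated, encoded sequence
print(encoded["overflowing_tokens"])   # ids that were cut off, in input order after this fix
print(encoded["num_truncated_tokens"])
```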
-
Nicolas Patry authored
`audio-classification`.
-
Suraj Patil authored
-
NielsRogge authored
-
Sachin Abeywardana authored
* Update clip loss calculation

  Hello, I'm the author of the blog you took the snippet from. I think this way of calculating it is possibly slightly more accurate.
* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
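The change concerns the symmetric contrastive loss; below is a standalone sketch of that loss over an image-text similarity matrix (function name and the random example batch are illustrative, not the model's actual forward code):

```python
import torch
import torch.nn.functional as F

def clip_style_loss(similarity: torch.Tensor) -> torch.Tensor:
    """Symmetric cross-entropy over an (N, N) text-image similarity matrix.

    The i-th text matches the i-th image, so the targets are simply 0..N-1,
    and the loss averages the text->image and image->text directions.
    """
    targets = torch.arange(similarity.size(0), device=similarity.device)
    caption_loss = F.cross_entropy(similarity, targets)    # text -> image
    image_loss = F.cross_entropy(similarity.t(), targets)  # image -> text
    return (caption_loss + image_loss) / 2.0

# Example with random logits for a batch of 4 image-text pairs
loss = clip_style_loss(torch.randn(4, 4))
```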
-
Eduardo Gonzalez Ponferrada authored
[Flax/run_hybrid_clip] Fix duplicating images when captions_per_image exceeds the number of captions, enable truncation
-
Sylvain Gugger authored
-
- 01 Sep, 2021 26 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
* finish * finish
-
Lysandre Debut authored
-
Lysandre Debut authored
-
Anton Lozhkov authored
* Add Hubert to the auto feature extractor * Fix import structure
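With Hubert registered in the auto mapping, its feature extractor can be loaded through the auto class; a short sketch, where the checkpoint name and the dummy audio are assumptions:

```python
import numpy as np
from transformers import AutoFeatureExtractor

# Resolves to the feature extractor configured for the Hubert checkpoint.
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-base-ls960")

dummy_audio = np.zeros(16000, dtype=np.float32)  # 1 second of silence at 16 kHz
inputs = feature_extractor(dummy_audio, sampling_rate=16000, return_tensors="pt")
print(inputs.input_values.shape)
```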
-
Sylvain Gugger authored
-
NielsRogge authored
-
SaulLu authored
* add test in trainer and test tokenizer saving with trainer
* quality
* reverse trainer changes
* replace test in test_trainer by a test for all the tokenizers
* format
* add can_save_slow_tokenizer attribute to all tokenizers
* fix Herbert
* format
* Change comment in error
* add comments and a new assert
* Update src/transformers/models/albert/tokenization_albert_fast.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change ValueError barthez
* change ValueError BigBird
* change ValueError Camembert
* change ValueError Mbart50
* change ValueError Pegasus
* change ValueError ReFormer
* change ValueError T5
* change ValueError RoBERTa
* XLNET fast
* Update tests/test_tokenization_common.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change `assert` into `self.assertIn`
* format

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
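A tiny sketch of the attribute this PR adds (the ALBERT checkpoint and output directory are assumptions): fast tokenizers expose `can_save_slow_tokenizer`, and saving one whose slow files (e.g. the sentencepiece model) are unavailable now raises a ValueError instead of tripping an assert.

```python
from transformers import AlbertTokenizerFast

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
# True when the original sentencepiece file is available, so save_pretrained
# can also write out the files needed to rebuild the slow tokenizer.
print(tokenizer.can_save_slow_tokenizer)
tokenizer.save_pretrained("./albert-tokenizer")
```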
-
Sylvain Gugger authored
-
Li-Huai (Allan) Lin authored
This reverts commit ffecfea9.
-
NielsRogge authored
* Remove disclaimer
* First draft
* Fix rebase
* Improve docs some more
* Add inference section
* Improve example scripts section
* Improve code examples of modeling files
* Add docs regarding task prefix
* Address @craffel's comments
* Apply suggestions from @patrickvonplaten's review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add suggestions from code review
* Apply @sgugger's suggestions
* Fix Flax code examples
* Fix index.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
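The task-prefix section added to the T5 docs boils down to selecting the task in the prompt itself; a short sketch, assuming the `t5-small` checkpoint and the standard translation prefix:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 was pre-trained with textual task prefixes, so the task is encoded in the input text.
input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```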
-
donggyukimc authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* up
* correct some bugs
* correct model
* finish speech2text extension
* up
* up
* up
* up
* Update utils/custom_init_isort.py
* up
* up
* update with tokenizer
* correct old tok
* correct old tok
* fix bug
* up
* up
* add more tests
* up
* fix docs
* up
* fix some more tests
* add better config
* correct some more things
* fix tests
* improve docs
* Apply suggestions from code review
* Apply suggestions from code review
* final fixes
* finalize
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply suggestions from Lysandre and Sylvain
* apply Nico's suggestions
* upload everything
* finish

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Lysandre Debut authored
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing for ConvBert (#12287)

* added token_type_ids buffer to fix the issue #5664
* Handling the case that position_id buffer is not registered
* added token_type_ids buffer to fix the issue #5664
* modified to support device conversion when the model is traced
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for position-ids during Tracing for DistilBERT (#12290)

* registered buffer for position-ids to address issues similar to issue #5664
* added comment
* added the flag to prevent from adding the buffer into the state_dict
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for position-ids during Tracing for Flaubert (#12292)

* adding position_ids buffer to fix the issue similar to #5664
* adding position-id buffer to address similar issues to #5664
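The three tracing fixes above share the same idea: register `position_ids`/`token_type_ids` as non-persistent module buffers so they follow the module's device instead of being created on a hard-coded device inside `forward`. A generic sketch of that pattern on a toy module (not the actual ConvBERT/DistilBERT/Flaubert code):

```python
import torch
from torch import nn

class ToyEmbeddings(nn.Module):
    def __init__(self, max_position_embeddings=512, hidden_size=8, vocab_size=100):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.position_embeddings = nn.Embedding(max_position_embeddings, hidden_size)
        # A non-persistent buffer moves with the module on .to(device) and is traced
        # correctly, while staying out of the state_dict (the "flag" mentioned above).
        self.register_buffer(
            "position_ids", torch.arange(max_position_embeddings).expand((1, -1)), persistent=False
        )

    def forward(self, input_ids):
        seq_length = input_ids.size(1)
        position_ids = self.position_ids[:, :seq_length]
        return self.word_embeddings(input_ids) + self.position_embeddings(position_ids)
```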
-
Lysandre Debut authored
* Torchscript test for Flaubert * Update tests/test_modeling_flaubert.py * Update tests/test_modeling_flaubert.py
-
Lysandre Debut authored
* Torchscript test for ConvBERT * Apply suggestions from code review
-
Lysandre Debut authored
* Torchscript test for DistilBERT * Update tests/test_modeling_distilbert.py
-
Lysandre Debut authored
* Torchscript test * Remove print statement
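The four TorchScript test additions above roughly exercise the equivalent of the following (checkpoint choice and input text are assumptions; `torchscript=True` makes the model return tuples so it can be traced):

```python
import torch
from transformers import DistilBertModel, DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased", torchscript=True)
model.eval()

inputs = tokenizer("Tracing a transformer with TorchScript", return_tensors="pt")
traced = torch.jit.trace(model, (inputs["input_ids"], inputs["attention_mask"]))
hidden_states = traced(inputs["input_ids"], inputs["attention_mask"])[0]
print(hidden_states.shape)
```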
-
Anton Lozhkov authored
* Add the audio classification pipeline
* Remove autoconfig exception
* Mark ffmpeg test as slow
* Rearrange pipeline tests
* Add small test
* Replace asserts with ValueError
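A minimal usage sketch of the new pipeline; the checkpoint name and file path are assumptions, and decoding a file path relies on ffmpeg (which is why that test is marked slow):

```python
from transformers import pipeline

classifier = pipeline("audio-classification", model="superb/hubert-base-superb-ks")

# Pass a path to a local audio file (decoded with ffmpeg), or a raw 1-D float
# numpy waveform sampled at the model's expected rate.
predictions = classifier("speech_sample.wav")
print(predictions)  # list of {"label": ..., "score": ...} dicts
```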
-
Patrick von Platen authored
-
Jonathan Chang authored
* Add option to add flax
* Add flax template for __init__.py
* Add flax template for .rst
* Copy TF modeling template
* Add a missing line in modeling_tf_... template
* Update first half of modeling_flax_..
* Update encoder flax template
* Copy test_modeling_tf... as test_modeling_flax...
* Replace some TF to Flax in test_modeling_flax_...
* Replace tf to np

  some function might not work, like _assert_tensors_equal
* Replace remaining tf to np (might not work)
* Fix cookiecutter
* Add Flax in to_replace_... template
* Update transformers-cli add-new-model
* Save generate_flax in configuration.json

  This will be read by transformers-cli
* Fix to_replace_... and cli
* Fix replace cli
* Fix cookiecutter name
* Move docstring earlier to avoid not defined error
* Fix a missing Module
* Add encoder-decoder flax template from bart
* Fix flax test
* Make style
* Fix endif
* Fix replace all "utf-8 -> unp-8"
* Update comment
* Fix flax template (add missing ..._DOCSTRING)
* Use flax_bart imports in template (was t5)
* Fix unp
* Update templates/adding_a_new_model/tests
* Revert "Fix unp"

  This reverts commit dc9002a41d902c4f9b07343eab1cb350c8b7fd57.
* Remove one line of copied from to suppress CI error
* Use generate_tensorflow_pytorch_and_flax
* Add a missing part
* fix typo
* fix flax config
* add examples for flax
* small rename
* correct modeling imports
* correct auto loading
* corrects some flax tests
* correct small typo
* correct as type
* finish modif
* correct more templates
* final fixes
* add file testers
* up
* make sure tests match template regex
* correct pytorch
* correct tf
* correct more tf
* correct imports
* minor error
* minor error
* correct init
* more fixes
* correct more flax tests
* correct flax test
* more fixes
* correct docs
* update
* fix

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
-
- 31 Aug, 2021 5 commits
-
-
Stella Biderman authored
* Test GPTJ implementation
* Fixed conflicts
* Update __init__.py
* Update __init__.py
* change GPT_J to GPTJ
* fix missing imports and typos
* use einops for now (need to change to torch ops later)
* Use torch ops instead of einsum
* remove einops deps
* Update configuration_auto.py
* Added GPT J
* Update gptj.rst
* Update __init__.py
* Update test_modeling_gptj.py
* Added GPT J
* Changed configs to match GPT2 instead of GPT Neo
* Removed non-existent sequence model
* Update configuration_auto.py
* Update configuration_auto.py
* Update configuration_auto.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* Progress on updating configs to agree with GPT2
* Update modeling_gptj.py
* num_layers -> n_layer
* layer_norm_eps -> layer_norm_epsilon
* attention_layers -> num_hidden_layers
* Update modeling_gptj.py
* attention_pdrop -> attn_pdrop
* hidden_act -> activation_function
* Update configuration_gptj.py
* Update configuration_gptj.py
* Update configuration_gptj.py
* Update configuration_gptj.py
* Update configuration_gptj.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* Update modeling_gptj.py
* fix layernorm and lm_head size delete attn_type
* Update docs/source/model_doc/gptj.rst
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* removed claim that GPT J uses local attention
* Removed GPTJForSequenceClassification
* Update src/transformers/models/gptj/configuration_gptj.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Removed unsupported boilerplate
* Update tests/test_modeling_gptj.py
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update tests/test_modeling_gptj.py
  Co-authored-by: Eric Hallahan <eric@hallahans.name>
* Update tests/test_modeling_gptj.py
  Co-authored-by: Eric Hallahan <eric@hallahans.name>
* Update tests/test_modeling_gptj.py
  Co-authored-by: Eric Hallahan <eric@hallahans.name>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update __init__.py
* Update configuration_gptj.py
* Update modeling_gptj.py
* Corrected indentation
* Remove stray backslash
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
* Update docs to match
* Remove tf loading
* Remove config.jax
* Remove stray `else:` statement
* Remove references to `load_tf_weights_in_gptj`
* Adapt tests to match output from GPT-J 6B
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Default `activation_function` to `gelu_new`
  - Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()`
* Fix part of the config documentation
* Revert "Update configuration_auto.py"

  This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb.
* Revert "Update configuration_auto.py"

  This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011.
* Revert "Update configuration_auto.py"

  This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f.
* Revert "Update configuration_auto.py"

  This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6.
* Hyphenate GPT-J
* Undid sorting of the models alphabetically
* Reverting previous commit
* fix style and quality issues
* Update docs/source/model_doc/gptj.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/configuration_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/configuration_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/configuration_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Replaced GPTJ-specific code with generic code
* Update src/transformers/models/gptj/modeling_gptj.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Made the code always use rotary positional encodings
* Update index.rst
* Fix documentation
* Combine attention classes
  - Condense all attention operations into `GPTJAttention`
  - Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout`
* Removed `config.rotary_dim` from tests
* Update test_modeling_gptj.py
* Update test_modeling_gptj.py
* Fix formatting
* Removed deprecated argument `layer_id` to `GPTJAttention`
* Update modeling_gptj.py
* Update modeling_gptj.py
* Fix code quality
* Restore model functionality
* Save `lm_head.weight` in checkpoints
* Fix crashes when loading with reduced precision
* refactor `self._attn(...)` and rename layer weights
* make sure logits are in fp32 for sampling
* improve docs
* Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist
* Added GPT-J to the README
* Fix doc/readme consistency
* Add rough parallelization support
  - Remove unused imports and variables
  - Clean up docstrings
  - Port experimental parallelization code from GPT-2 into GPT-J
* Clean up loose ends
* Fix index.rst

Co-authored-by: kurumuz <kurumuz1@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Eric Hallahan <eric@hallahans.name>
Co-authored-by: Leo Gao <54557097+leogao2@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
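A short generation sketch with the new model class; the checkpoint is the reference EleutherAI release, while the half-precision dtype and generation settings are assumptions to keep memory and runtime down (the 6B weights are large):

```python
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("The GPT-J architecture uses rotary position embeddings because", return_tensors="pt")
output_ids = model.generate(**inputs, do_sample=True, max_length=48, temperature=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```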
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-