Commits · 8b240a06617455eae59e1116af6a1a016664e963 · chenpangpang / transformers

12 Oct, 2021 1 commit

Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06

Yih-Dar authored Oct 13, 2021



* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8b240a06

08 Oct, 2021 2 commits

Fixed typo: herBERT -> HerBERT (#13936) · 23ee06ed
Adam Kaczmarek authored Oct 08, 2021

23ee06ed

Image Segmentation pipeline (#13828) · 026866df

Mishig Davaadorj authored Oct 08, 2021



* Implement img seg pipeline

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update output shape with individual masks

* Rm dev change

* Remove loops in test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

026866df

06 Oct, 2021 3 commits

Deploy docs for v4.11.3 · 5be59a36
Lysandre authored Oct 06, 2021

5be59a36
Autodocument the list of ONNX-supported models (#13884) · 7d83655d
Sylvain Gugger authored Oct 05, 2021

7d83655d

Update parallelism.md (#13892) · 36fc4016

Hyunwoong Ko authored Oct 06, 2021



* Update parallelism.md

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update docs/source/parallelism.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

36fc4016

04 Oct, 2021 3 commits

Fix broken link to distill models in docs (#13848) · 6c088406
Evgeniy Zheltonozhskiy authored Oct 04, 2021
```
* Fix broken link to distill models

* Missing symbol

* Fix spaces
```
6c088406

Add Mistral GPT-2 Stability Tweaks (#13573) · 3a8de58c

Sidd Karamcheti authored Oct 04, 2021



* Add layer-wise scaling

* Add reorder & upcasting argument

* Add OpenAI GPT-2 weight initialization scheme

* start `layer_idx` count at zero for consistency

* disentangle attn and reordered and upscaled attn function

* rename `scale_attn_by_layer` to `scale_attn_by_layer_id`

* make autocast from amp compatible with pytorch<1.6

* fix docstring

* style fixes

* Add fixes from PR feedback, style tweaks

* Fix doc whitespace

* Reformat

* First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests

* Rename scale_attn_by_layer_idx, add tip

* Remove extra newline

* add test for weight initialization

* update code format

* add assert check weights are fp32

* remove assert

* Fix incorrect merge

* Fix shape mismatch in baddbmm

* Add generation test for Mistral flags
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
Co-authored-by: J38 <jebolton@stanford.edu>

3a8de58c

[docs/gpt-j] fix typo (#13851) · 955fd4fe
Yaser Abdelaziz authored Oct 04, 2021

955fd4fe

30 Sep, 2021 3 commits
- [DPR] Correct init (#13796) · 41436d3d
  Patrick von Platen authored Sep 30, 2021
```
* update

* add to docs and init

* make fix-copies
```
  41436d3d
- [testing] auto-replay captured streams (#13803) · e1d1c7c0
  Stas Bekman authored Sep 30, 2021
  
  e1d1c7c0
- Update doc for v4.11.2 · 5f25855b
  Sylvain Gugger authored Sep 30, 2021
  
  5f25855b
29 Sep, 2021 4 commits

[docs/gpt-j] addd instructions for how minimize CPU RAM usage (#13795) · bf6118e7
Suraj Patil authored Sep 29, 2021
```
* add a note about tokenizer

* add  tips to load model is less RAM

* fix link

* fix more links
```
bf6118e7
Update doc for v4.11.1 · cf4aa359
Sylvain Gugger authored Sep 29, 2021

cf4aa359
Fix LayoutLM ONNX test error (#13710) · a1ea3adb
Nishant Prabhu authored Sep 29, 2021
```
Fix LayoutLM ONNX test error
```
a1ea3adb

Keras callback to push to hub each epoch, or after N steps (#13773) · 3a8a8013

Matt authored Sep 29, 2021



* Keras callback to push to hub each epoch, or after N steps

* Reworked the callback to use Repository

* Use an Enum for save_strategy

* Style pass

* Correct type for tokenizer

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding print message to the final upload

* Adding print message to the final upload

* Change how we wait for the last process to finish

* is_done is a property, not a method, derp

* Docstrings and documentation

* Style pass

* Style edit

* Docstring reformat

* Docstring rewrite

* Replacing print with internal logger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3a8a8013

27 Sep, 2021 2 commits
- Docs for version v4.11.0 · 11c69b80
  Lysandre authored Sep 27, 2021
  
  11c69b80
- Release: v4.11.0 · dc193c90
  Lysandre authored Sep 27, 2021
  
  dc193c90
22 Sep, 2021 3 commits

Add BlenderBot small tokenizer to the init (#13367) · 5b570754

Lysandre Debut authored Sep 22, 2021



* Add BlenderBot small tokenizer to the init

* Update src/transformers/__init__.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Style

* Bugfix
Co-authored-by: Suraj Patil <surajp815@gmail.com>

5b570754

add a note about tokenizer (#13696) · 6dc41d9f
Suraj Patil authored Sep 23, 2021

6dc41d9f

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

21 Sep, 2021 2 commits

beit-flax (#13515) · a2dec768

Kamal Raj authored Sep 21, 2021

* beit-flax

* updated FLAX_BEIT_MLM_DOCSTRING

* removed bool_masked_pos from classification

* updated Copyright

* code refactoring: x -> embeddings

* updated test: rm from_pt

* Update docs/source/model_doc/beit.rst

* model code dtype updates and
other changes according to review

* relative_position_bias
revert back to pytorch design

a2dec768

Add Speech AutoModels (#13655) · 48fa42e5
Patrick von Platen authored Sep 21, 2021
```
* upload

* correct

* correct

* correct

* finish

* up

* up

* up again
```
48fa42e5

20 Sep, 2021 4 commits

Fix typo distilbert doc (#13643) · ea921365
flozi00 authored Sep 20, 2021

ea921365
Change https:/ to https:// (#13644) · aeb2dac0
flozi00 authored Sep 20, 2021

aeb2dac0
Fix mT5 documentation (#13639) · 04976a32
Ayaka Mikazuki authored Sep 20, 2021
```
* Fix MT5 documentation

The abstract is incomplete

* MT5 -> mT5
```
04976a32

Add FNet (#13045) · d8049331

Gunjan Chhablani authored Sep 20, 2021



* Init FNet

* Update config

* Fix config

* Update model classes

* Update tokenizers to use sentencepiece

* Fix errors in model

* Fix defaults in config

* Remove position embedding type completely

* Fix typo and take only real numbers

* Fix type vocab size in configuration

* Add projection layer to embeddings

* Fix position ids bug in embeddings

* Add minor changes

* Add conversion script and remove CausalLM vestiges

* Fix conversion script

* Fix conversion script

* Remove CausalLM Test

* Update checkpoint names to dummy checkpoints

* Add tokenizer mapping

* Fix modeling file and corresponding tests

* Add tokenization test file

* Add PreTraining model test

* Make style and quality

* Make tokenization base tests work

* Update docs

* Add FastTokenizer tests

* Fix fast tokenizer special tokens

* Fix style and quality

* Remove load_tf_weights vestiges

* Add FNet to  main README

* Fix configuration example indentation

* Comment tokenization slow test

* Fix style

* Add changes from review

* Fix style

* Remove bos and eos tokens from tokenizers

* Add tokenizer slow test, TPU transforms, NSP

* Add scipy check

* Add scipy availabilty check to test

* Fix tokenizer and use correct inputs

* Remove remaining TODOs

* Fix tests

* Fix tests

* Comment Fourier Test

* Uncomment Fourier Test

* Change to google checkpoint

* Add changes from review

* Fix activation function

* Fix model integration test

* Add more integration tests

* Add comparison steps to MLM integration test

* Fix style

* Add masked tokenization fix

* Improve mask tokenization fix

* Fix index docs

* Add changes from review

* Fix issue

* Fix failing import in test

* some more fixes

* correct fast tokenizer

* finalize

* make style

* Remove additional tokenization logic

* Set do_lower_case to False

* Allow keeping accents

* Fix tokenization test

* Fix FNet Tokenizer Fast

* fix tests

* make style

* Add tips to FNet docs
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

d8049331

14 Sep, 2021 3 commits

separate model card git push from the rest (#13514) · 054b6013
elishowk authored Sep 14, 2021
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
054b6013

[Flax] Addition of FlaxPegasus (#13420) · c1e47bf4

Bhadresh Savani authored Sep 14, 2021



* added initial files

* fixes pipeline

* fixes style and quality

* fixes doc issue and positional encoding

* fixes layer norm and test

* fixes quality issue

* fixes code quality

* removed extra layer norm

* added layer norm back in encoder and decoder

* added more code copy quality checks

* update tests

* Apply suggestions from code review

* fix import

* fix test
Co-authored-by: patil-suraj <surajp815@gmail.com>

c1e47bf4

Push to hub when saving checkpoints (#13503) · 3081d386

Sylvain Gugger authored Sep 14, 2021

* Push to hub when saving checkpoints

* Add model card

* Revert partial model card

* Small fix for checkpoint

* Add tests

* Add documentation

* Fix tests

* Bump huggingface_hub

* Fix test

3081d386

13 Sep, 2021 1 commit

Small changes in `perplexity.rst`to make the notebook executable on google collaboratory (#13541) · 149c833b

SaulLu authored Sep 13, 2021



* add imports

* Update docs/source/perplexity.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

149c833b

10 Sep, 2021 2 commits

Docs for v4.10.1 · 72ec2f3e
patrickvonplaten authored Sep 10, 2021

72ec2f3e

[Large PR] Entire rework of pipelines. (#13308) · c63fcabf

Nicolas Patry authored Sep 10, 2021



* Enabling dataset iteration on pipelines.

Enabling dataset iteration on pipelines.

Unifying parameters under `set_parameters` function.

Small fix.

Last fixes after rebase

Remove print.

Fixing text2text `generate_kwargs`

No more `self.max_length`.

Fixing tf only conversational.

Consistency in start/stop index over TF/PT.

Speeding up drastically on TF (nasty bug where max_length would increase
a ton.)

Adding test for support for non fast tokenizers.

Fixign GPU usage on zero-shot.

Fix working on Tf.

Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Small cleanup.

Remove all asserts + simple format.

* Fixing audio-classification for large PR.

* Overly explicity null checking.

* Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.

* Removed internal state for parameters of the  pipeline.

Instead of overriding implicitly internal state, we moved
to real named arguments on every `preprocess`, `_forward`,
`postprocess` function.

Instead `_sanitize_parameters` will be used to split all kwargs
of both __init__ and __call__ into the 3 kinds of named parameters.

* Move import warnings.

* Small fixes.

* Quality.

* Another small fix, using the CI to debug faster.

* Last fixes.

* Last fix.

* Small cleanup of tensor moving.

* is not None.

* Adding a bunch of docs + a iteration test.

* Fixing doc style.

* KeyDataset = None guard.

* RRemoving the Cuda test for pipelines (was testing).

* Even more simple iteration test.

* Correct import .

* Long day.

* Fixes in docs.

* [WIP] migrating object detection.

* Fixed the target_size bug.

* Fixup.

* Bad variable name.

* Fixing `ensure_on_device` respects original ModelOutput.

c63fcabf

08 Sep, 2021 4 commits

Fix typo in deepspeed documentation (#13482) · c3757380
Aleksander Smywiński-Pohl authored Sep 08, 2021
```
* Fix typo in deepspeed documentation

* Add missing import in deepspeed configuration
```
c3757380
fixed document (#13414) · 41cd52a7
Mohan Zhang authored Sep 08, 2021

41cd52a7

Object detection pipeline (#12886) · 2a15e8cc

Mishig Davaadorj authored Sep 08, 2021



* Implement object-detection pipeline

* Define threshold const

* Add `threshold` argument

* Refactor

* Uncomment test inputs

* `rm
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better doc
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm unnecessary lines
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better naming
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

* Add `detr-tiny` for tests

* Add `ObjectDetectionPipeline` to `trnsfrmrs/init`

* Implement new bbox format

* Update detr post_process

* Update `load_img` method obj det pipeline

* make style

* Implement new testing format for obj det pipeln

* Add guard pytorch specific code in pipeline

* Add doc

* Make pipeline_obj_tet tests deterministic

* Revert some changes to `post_process` COCO api

* Chore

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm timm requirement

* make fixup

* Add timm requirement to test

* Make fixup

* Guard torch.Tensor

* Chore

* Delete unnecessary comment
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

2a15e8cc

Enable automated model list copying for localized READMEs (#13465) · 18447c20

Li-Huai (Allan) Lin authored Sep 08, 2021



* Complete basic mechanism

* Save

* Complete everything

* Style & Quality

* Update READMEs

* Add testing

* Fix README.md format

* Apply suggestions

* Fix format

* Update utils/check_copies.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

18447c20

07 Sep, 2021 1 commit

[docs] update dead quickstart link on resuing past for GPT2 (#13455) · 4be082ce

shabie authored Sep 07, 2021

* [docs] update dead quickstart link on resuing past for GPT2

Thed dead link have been replaced by two links of forward and call methods of the GPT2 class for torch and tensorflow respectively.

* [docs] fix formatting for gpt2 page update

4be082ce

06 Sep, 2021 1 commit

Update model configs - Allow setters for common properties (#13026) · c8be8a9a

Nils Reimers authored Sep 06, 2021

* refactor GPT Config to allow dyn. properties

* make attribute_map a class attribute

* remove old code

* update unit test to test config: Add test for common properties setter

* update unit test to test config: Add test for common properties passed as parameters to __init__

* update to black code format

* Allow that setters are not defined for certain config classes

* update config classes to implement attribute_map

* bugfix lxmert config - id2labels was not defined when num_labels was set

* update broken configs - add attribute_maps

* update bart config

* update black codestyle

* update documentation on common config attributes

* update GPTJ config to new attribute map

* update docs on common attributes

* gptj config: add max_position_embeddings

* gptj config: format with black

* update speech to text 2 config

* format doc file to max_len 119

* update config template

c8be8a9a

02 Sep, 2021 1 commit

[docs] Update perplexity.rst to use negative log likelihood (#13386) · 596bb85f

Aman Madaan authored Sep 02, 2021

* [docs] Update perplexity.rst to use negative log likelihood

Model `forward` returns the negative log likelihood. The document correctly defines and calculates perplexity, but the description and variable names are inconsistent, which might cause confusion.

* [docs] restyle perplexity.rst

596bb85f