Commits · 23ee06ed5551cf5d5701ed2e917c60b6dd584a88 · chenpangpang / transformers

08 Oct, 2021 2 commits

Fixed typo: herBERT -> HerBERT (#13936) · 23ee06ed
Adam Kaczmarek authored Oct 08, 2021

23ee06ed

Image Segmentation pipeline (#13828) · 026866df

Mishig Davaadorj authored Oct 08, 2021



* Implement img seg pipeline

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update output shape with individual masks

* Rm dev change

* Remove loops in test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

026866df

04 Oct, 2021 2 commits

Add Mistral GPT-2 Stability Tweaks (#13573) · 3a8de58c

Sidd Karamcheti authored Oct 04, 2021



* Add layer-wise scaling

* Add reorder & upcasting argument

* Add OpenAI GPT-2 weight initialization scheme

* start `layer_idx` count at zero for consistency

* disentangle attn and reordered and upscaled attn function

* rename `scale_attn_by_layer` to `scale_attn_by_layer_id`

* make autocast from amp compatible with pytorch<1.6

* fix docstring

* style fixes

* Add fixes from PR feedback, style tweaks

* Fix doc whitespace

* Reformat

* First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests

* Rename scale_attn_by_layer_idx, add tip

* Remove extra newline

* add test for weight initialization

* update code format

* add assert check weights are fp32

* remove assert

* Fix incorrect merge

* Fix shape mismatch in baddbmm

* Add generation test for Mistral flags
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
Co-authored-by: J38 <jebolton@stanford.edu>

3a8de58c

[docs/gpt-j] fix typo (#13851) · 955fd4fe
Yaser Abdelaziz authored Oct 04, 2021

955fd4fe

30 Sep, 2021 1 commit

[DPR] Correct init (#13796) · 41436d3d

Patrick von Platen authored Sep 30, 2021

* update

* add to docs and init

* make fix-copies

41436d3d

29 Sep, 2021 1 commit

[docs/gpt-j] addd instructions for how minimize CPU RAM usage (#13795) · bf6118e7

Suraj Patil authored Sep 29, 2021

* add a note about tokenizer

* add tips to load model in less RAM

* fix link

* fix more links

bf6118e7

22 Sep, 2021 3 commits

Add BlenderBot small tokenizer to the init (#13367) · 5b570754

Lysandre Debut authored Sep 22, 2021



* Add BlenderBot small tokenizer to the init

* Update src/transformers/__init__.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Style

* Bugfix
Co-authored-by: Suraj Patil <surajp815@gmail.com>

5b570754

add a note about tokenizer (#13696) · 6dc41d9f
Suraj Patil authored Sep 23, 2021

6dc41d9f

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

21 Sep, 2021 2 commits

beit-flax (#13515) · a2dec768

Kamal Raj authored Sep 21, 2021

* beit-flax

* updated FLAX_BEIT_MLM_DOCSTRING

* removed bool_masked_pos from classification

* updated Copyright

* code refactoring: x -> embeddings

* updated test: rm from_pt

* Update docs/source/model_doc/beit.rst

* model code dtype updates and other changes according to review

* relative_position_bias revert back to pytorch design

a2dec768

Add Speech AutoModels (#13655) · 48fa42e5

Patrick von Platen authored Sep 21, 2021

* upload

* correct

* correct

* correct

* finish

* up

* up

* up again

48fa42e5

20 Sep, 2021 3 commits

Fix typo distilbert doc (#13643) · ea921365

flozi00 authored Sep 20, 2021

ea921365

Fix mT5 documentation (#13639) · 04976a32

Ayaka Mikazuki authored Sep 20, 2021

* Fix MT5 documentation

The abstract is incomplete

* MT5 -> mT5

04976a32

Add FNet (#13045) · d8049331

Gunjan Chhablani authored Sep 20, 2021



* Init FNet

* Update config

* Fix config

* Update model classes

* Update tokenizers to use sentencepiece

* Fix errors in model

* Fix defaults in config

* Remove position embedding type completely

* Fix typo and take only real numbers

* Fix type vocab size in configuration

* Add projection layer to embeddings

* Fix position ids bug in embeddings

* Add minor changes

* Add conversion script and remove CausalLM vestiges

* Fix conversion script

* Fix conversion script

* Remove CausalLM Test

* Update checkpoint names to dummy checkpoints

* Add tokenizer mapping

* Fix modeling file and corresponding tests

* Add tokenization test file

* Add PreTraining model test

* Make style and quality

* Make tokenization base tests work

* Update docs

* Add FastTokenizer tests

* Fix fast tokenizer special tokens

* Fix style and quality

* Remove load_tf_weights vestiges

* Add FNet to  main README

* Fix configuration example indentation

* Comment tokenization slow test

* Fix style

* Add changes from review

* Fix style

* Remove bos and eos tokens from tokenizers

* Add tokenizer slow test, TPU transforms, NSP

* Add scipy check

* Add scipy availability check to test

* Fix tokenizer and use correct inputs

* Remove remaining TODOs

* Fix tests

* Fix tests

* Comment Fourier Test

* Uncomment Fourier Test

* Change to google checkpoint

* Add changes from review

* Fix activation function

* Fix model integration test

* Add more integration tests

* Add comparison steps to MLM integration test

* Fix style

* Add masked tokenization fix

* Improve mask tokenization fix

* Fix index docs

* Add changes from review

* Fix issue

* Fix failing import in test

* some more fixes

* correct fast tokenizer

* finalize

* make style

* Remove additional tokenization logic

* Set do_lower_case to False

* Allow keeping accents

* Fix tokenization test

* Fix FNet Tokenizer Fast

* fix tests

* make style

* Add tips to FNet docs
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

d8049331

14 Sep, 2021 1 commit

[Flax] Addition of FlaxPegasus (#13420) · c1e47bf4

Bhadresh Savani authored Sep 14, 2021



* added initial files

* fixes pipeline

* fixes style and quality

* fixes doc issue and positional encoding

* fixes layer norm and test

* fixes quality issue

* fixes code quality

* removed extra layer norm

* added layer norm back in encoder and decoder

* added more code copy quality checks

* update tests

* Apply suggestions from code review

* fix import

* fix test
Co-authored-by: patil-suraj <surajp815@gmail.com>

c1e47bf4

08 Sep, 2021 1 commit

Object detection pipeline (#12886) · 2a15e8cc

Mishig Davaadorj authored Sep 08, 2021



* Implement object-detection pipeline

* Define threshold const

* Add `threshold` argument

* Refactor

* Uncomment test inputs

* `rm
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better doc
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm unnecessary lines
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better naming
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

* Add `detr-tiny` for tests

* Add `ObjectDetectionPipeline` to `trnsfrmrs/init`

* Implement new bbox format

* Update detr post_process

* Update `load_img` method obj det pipeline

* make style

* Implement new testing format for obj det pipeln

* Add guard pytorch specific code in pipeline

* Add doc

* Make pipeline_obj_tet tests deterministic

* Revert some changes to `post_process` COCO api

* Chore

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm timm requirement

* make fixup

* Add timm requirement to test

* Make fixup

* Guard torch.Tensor

* Chore

* Delete unnecessary comment
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

2a15e8cc

07 Sep, 2021 1 commit

[docs] update dead quickstart link on resuing past for GPT2 (#13455) · 4be082ce

shabie authored Sep 07, 2021

* [docs] update dead quickstart link on resuing past for GPT2

The dead link has been replaced by two links, to the forward and call methods of the GPT2 class for torch and tensorflow respectively.

* [docs] fix formatting for gpt2 page update

4be082ce

02 Sep, 2021 2 commits

fix example (#13387) · e92140c5

Suraj Patil authored Sep 02, 2021

e92140c5

Add tokenizer docs (#13373) · 4114c9a7

NielsRogge authored Sep 02, 2021

4114c9a7

01 Sep, 2021 3 commits

Improve T5 docs (#13240) · 4766e009

NielsRogge authored Sep 01, 2021



* Remove disclaimer

* First draft

* Fix rebase

* Improve docs some more

* Add inference section

* Improve example scripts section

* Improve code examples of modeling files

* Add docs regarding task prefix

* Address @craffel's comments

* Apply suggestions from @patrickvonplaten's review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add suggestions from code review

* Apply @sgugger's suggestions

* Fix Flax code examples

* Fix index.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

4766e009

Add SpeechEncoderDecoder & Speech2Text2 (#13186) · 0b8c84e1

Patrick von Platen authored Sep 01, 2021



* fix_torch_device_generate_test

* remove @

* up

* correct some bugs

* correct model

* finish speech2text extension

* up

* up

* up

* up

* Update utils/custom_init_isort.py

* up

* up

* update with tokenizer

* correct old tok

* correct old tok

* fix bug

* up

* up

* add more tests

* up

* fix docs

* up

* fix some more tests

* add better config

* correct some more things

* fix tests

* improve docs

* Apply suggestions from code review

* Apply suggestions from code review

* final fixes

* finalize

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply suggestions Lysandre and Sylvain

* apply nicos suggestions

* upload everything

* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

0b8c84e1

Add the `AudioClassificationPipeline` (#13342) · b9c6a976

Anton Lozhkov authored Sep 01, 2021

* Add the audio classification pipeline

* Remove autoconfig exception

* Mark ffmpeg test as slow

* Rearrange pipeline tests

* Add small test

* Replace asserts with ValueError

b9c6a976

31 Aug, 2021 3 commits

GPT-J-6B (#13022) · c02cd95c

Stella Biderman authored Aug 31, 2021



* Test GPTJ implementation

* Fixed conflicts

* Update __init__.py

* Update __init__.py

* change GPT_J to GPTJ

* fix missing imports and typos

* use einops for now
(need to change to torch ops later)

* Use torch ops instead of einsum

* remove einops deps

* Update configuration_auto.py

* Added GPT J

* Update gptj.rst

* Update __init__.py

* Update test_modeling_gptj.py

* Added GPT J

* Changed configs to match GPT2 instead of GPT Neo

* Removed non-existent sequence model

* Update configuration_auto.py

* Update configuration_auto.py

* Update configuration_auto.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Progress on updating configs to agree with GPT2

* Update modeling_gptj.py

* num_layers -> n_layer

* layer_norm_eps -> layer_norm_epsilon

* attention_layers -> num_hidden_layers

* Update modeling_gptj.py

* attention_pdrop -> attn_pdrop

* hidden_act -> activation_function

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update configuration_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* Update modeling_gptj.py

* fix layernorm and lm_head size
delete attn_type

* Update docs/source/model_doc/gptj.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* removed claim that GPT J uses local attention

* Removed GPTJForSequenceClassification

* Update src/transformers/models/gptj/configuration_gptj.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Removed unsupported boilerplate

* Update tests/test_modeling_gptj.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update tests/test_modeling_gptj.py
Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update tests/test_modeling_gptj.py
Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update tests/test_modeling_gptj.py
Co-authored-by: Eric Hallahan <eric@hallahans.name>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update __init__.py

* Update configuration_gptj.py

* Update modeling_gptj.py

* Corrected indentation

* Remove stray backslash

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Update docs to match

* Remove tf loading

* Remove config.jax

* Remove stray `else:` statement

* Remove references to `load_tf_weights_in_gptj`

* Adapt tests to match output from GPT-J 6B

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Default `activation_function` to `gelu_new`

- Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()`

* Fix part of the config documentation

* Revert "Update configuration_auto.py"

This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb.

* Revert "Update configuration_auto.py"

This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011.

* Revert "Update configuration_auto.py"

This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f.

* Revert "Update configuration_auto.py"

This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6.

* Hyphenate GPT-J

* Undid sorting of the models alphabetically

* Reverting previous commit

* fix style and quality issues

* Update docs/source/model_doc/gptj.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/configuration_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Replaced GPTJ-specific code with generic code

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Made the code always use rotary positional encodings

* Update index.rst

* Fix documentation

* Combine attention classes

- Condense all attention operations into `GPTJAttention`
- Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout`

* Removed `config.rotary_dim` from tests

* Update test_modeling_gptj.py

* Update test_modeling_gptj.py

* Fix formatting

* Removed deprecated argument `layer_id` to `GPTJAttention`

* Update modeling_gptj.py

* Update modeling_gptj.py

* Fix code quality

* Restore model functionality

* Save `lm_head.weight` in checkpoints

* Fix crashes when loading with reduced precision

* refactor `self._attn(...)` and rename layer weights

* make sure logits are in fp32 for sampling

* improve docs

* Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist

* Added GPT-J to the README

* Fix doc/readme consistency

* Add rough parallelization support

- Remove unused imports and variables
- Clean up docstrings
- Port experimental parallelization code from GPT-2 into GPT-J

* Clean up loose ends

* Fix index.rst
Co-authored-by: kurumuz <kurumuz1@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Eric Hallahan <eric@hallahans.name>
Co-authored-by: Leo Gao <54557097+leogao2@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: your_github_username <your_github_email>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

c02cd95c

Deberta_v2 tf (#13120) · 3efcfeab

Kamal Raj authored Aug 31, 2021

* Deberta_v2 tf

* added new line at the end of file, make style

* +V2, typo

* remove never executed branch of code

* rm cmnt and fixed typo in url filter

* cleanup according to review comments

* added #Copied from

3efcfeab

Add GPT2ForTokenClassification (#13290) · 41c55941

tucan9389 authored Aug 31, 2021



* Add GPT2ForTokenClassification

* Fix dropout exception for GPT2 NER

* Remove sequence label in test

* Change TokenClassifierOutput to TokenClassifierOutputWithPast

* Fix for black formatter

* Remove dummy

* Update docs for GPT2ForTokenClassification

* Fix check_inits ci fail

* Update dummy_pt_objects after make fix-copies

* Remove TokenClassifierOutputWithPast

* Fix tuple input issue
Co-authored-by: danielsejong55@gmail.com <danielsejong55@gmail.com>

41c55941

30 Aug, 2021 4 commits

albert flax (#13294) · 98e409ab

Kamal Raj authored Aug 30, 2021

* albert flax

* year -> 2021

* docstring updated for flax

* removed head_mask

* removed from_pt

* removed passing attention_mask to embedding layer

98e409ab

distilbert-flax (#13324) · 774760e6

Kamal Raj authored Aug 30, 2021

* distilbert-flax

* added missing self

* docs fix

* removed tied kernel extra init

* updated docs

* x -> hidden states

* removed head_mask

* removed from_pt, +FLAX

* updated year

774760e6

fix: typo spelling grammar (#13212) · 01977466

arfy slowy authored Aug 30, 2021

* fix: typo spelling grammar

* fix: make fixup

01977466

Add LayoutLMv2 + LayoutXLM (#12604) · b6ddb08a

NielsRogge authored Aug 30, 2021



* First commit

* Make style

* Fix dummy objects

* Add Detectron2 config

* Add LayoutLMv2 pooler

* More improvements, add documentation

* More improvements

* Add model tests

* Add clarification regarding image input

* Improve integration test

* Fix bug

* Fix another bug

* Fix another bug

* Fix another bug

* More improvements

* Make more tests pass

* Make more tests pass

* Improve integration test

* Remove gradient checkpointing and add head masking

* Add integration test

* Add LayoutLMv2ForSequenceClassification to the tests

* Add LayoutLMv2ForQuestionAnswering

* More improvements

* More improvements

* Small improvements

* Fix _LazyModule

* Fix fast tokenizer

* Move sync_batch_norm to a separate method

* Replace dummies by requires_backends

* Move calculation of visual bounding boxes to separate method + update README

* Add models to main init

* First draft

* More improvements

* More improvements

* More improvements

* More improvements

* More improvements

* Remove is_split_into_words

* More improvements

* Simply tesseract - no use of pandas anymore

* Add LayoutLMv2Processor

* Update is_pytesseract_available

* Fix bugs

* Improve feature extractor

* Fix bug

* Add print statement

* Add truncation of bounding boxes

* Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer

* Improve tokenizer tests

* Make more tokenizer tests pass

* Make more tests pass, add integration tests

* Finish integration tests

* More improvements

* More improvements - update API of the tokenizer

* More improvements

* Remove support for VQA training

* Remove some files

* Improve feature extractor

* Improve documentation and one more tokenizer test

* Make quality and small docs improvements

* Add batched tests for LayoutLMv2Processor, remove fast tokenizer

* Add truncation of labels

* Apply suggestions from code review

* Improve processor tests

* Fix failing tests and add suggestion from code review

* Fix tokenizer test

* Add detectron2 CI job

* Simplify CI job

* Comment out non-detectron2 jobs and specify number of processes

* Add pip install torchvision

* Add durations to see which tests are slow

* Fix tokenizer test and make model tests smaller

* First draft

* Use setattr

* Possible fix

* Proposal with configuration

* First draft of fast tokenizer

* More improvements

* Enable fast tokenizer tests

* Make more tests pass

* Make more tests pass

* More improvements

* Add padding to fast tokenizer

* Make more tests pass

* Make more tests pass

* Make all tests pass for fast tokenizer

* Make fast tokenizer support overflowing boxes and labels

* Add support for overflowing_labels to slow tokenizer

* Add support for fast tokenizer to the processor

* Update processor tests for both slow and fast tokenizers

* Add head models to model mappings

* Make style & quality

* Remove Detectron2 config file

* Add configurable option to label all subwords

* Fix test

* Skip visual segment embeddings in test

* Use ResNet-18 backbone in tests instead of ResNet-101

* Proposal

* Re-enable all jobs on CI

* Fix installation of tesseract

* Fix failing test

* Fix index table

* Add LayoutXLM doc page, first draft of code examples

* Improve documentation a lot

* Update expected boxes for Tesseract 4.0.0 beta

* Use offsets to create labels instead of checking if they start with ##

* Update expected boxes for Tesseract 4.1.1

* Fix conflict

* Make variable names cleaner, add docstring, add link to notebooks

* Revert "Fix conflict"

This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5.

* Revert to make integration test pass

* Apply suggestions from @LysandreJik's review

* Address @patrickvonplaten's comments

* Remove fixtures DocVQA in favor of dataset on the hub
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

b6ddb08a

27 Aug, 2021 1 commit

Add Wav2Vec2 & Hubert ForSequenceClassification (#13153) · b6f332ec

Anton Lozhkov authored Aug 27, 2021

* Add hubert classifier + tests

* Add hubert classifier + tests

* Dummies for all classification tests

* Wav2Vec2 classifier + ER test

* Fix hubert integration tests

* Add hubert IC

* Pass tests for all classification tasks on Hubert

* Pass all tests + copies

* Move models to the SUPERB org

b6f332ec

26 Aug, 2021 1 commit

Add DINO conversion script (#13265) · 0759f251

NielsRogge authored Aug 26, 2021

* First commit

* Add interpolation of patch embeddings

* Comment out code

* Fix bug

* Fix another bug

* Fix bug

* Fix another bug

* Remove print statements

* Update conversion script

* Use the official vit implementation

* Add support for converting dino_vits8

* Add DINO to docs of ViT

* Remove assertion

* Add interpolation of position encodings

* Fix bug

* Add align_corners

* Add interpolate_pos_encoding option to forward pass of ViTModel

* Improve interpolate_pos_encoding method

* Add docstring

0759f251

23 Aug, 2021 1 commit

Make Flax GPT2 working with cross attention (#13008) · 2e20c0f3

Yih-Dar authored Aug 23, 2021



* make flax gpt2 working with cross attention

* Remove encoder->decoder projection layer

* A draft (incomplete) for FlaxEncoderDecoderModel

* Add the method from_encoder_decoder_pretrained + the docstrings

* Fix the mistakes of using EncoderDecoderModel

* Fix style

* Add FlaxEncoderDecoderModel to the library

* Fix cyclic imports

* Add FlaxEncoderDecoderModel to modeling_flax_auto.py

* Remove question comments

* add tests for FlaxEncoderDecoderModel

* add flax_encoder_decoder to the lists of ignored entries in check_repo.py

* fix missing required positional arguments

* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()

Also fix generation eos/pad tokens issue

* Fix: Use sequences from the generated_output

* Change a check from assert to raise ValueError

* Fix examples and token ids issues

* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2

* Remove the changes in configuration docstrings.

* allow for bert 2 gpt2

* make fix-copies

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Change remaining examples to bert2gpt2

* Change the test to Bert2GPT2

* Fix examples

* Fix import

* Fix unpack bug

* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix: NotImplentedError -> NotImplementedError

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* up

* finalize
Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2e20c0f3

17 Aug, 2021 1 commit

Add splinter (#12955) · 439a43b6

Ori Ram authored Aug 17, 2021



* splinter template

* initialize splinter classes

* Splinter Tokenizer

* splinter.rst

* tokenization fixes

* Documentation & some minor variable name changes

* bug fix (added back question_token_id to config) + variable names

* Minor bug fixes + variable name changes

* Fix Splinter references after merge with new transformers

* changes after running make style & quality

* Fix documentation unindent

* Fix doc indentation in tokenization_splinter

* Fix also SplinterTokenizerFast

* Add Splinter to index.rst and README

* Fix double whitespace from index.rst

* Fixed index.rst with 'make fix-copies'

* Update docs/source/model_doc/splinter.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/splinter.rst
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/models/splinter/__init__.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Added "copied from BERT" comments

* Removing unnecessary code from modeling_splinter

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/configuration_splinter.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Remove references to TF modeling from splinter

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove unnecessary check

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add differences between Splinter and Bert tokenizers

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/tokenization_splinter_fast.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove unnecessary check

* Doc formatting

* Update src/transformers/models/splinter/tokenization_splinter.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/splinter/tokenization_splinter.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* bug fix: remove load_tf_weights attribute

* Some minor quality changes

* Update docs/source/model_doc/splinter.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/splinter/configuration_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Change FullyConnectedLayer to SplinterFullyConnectedLayer

* Variable naming

* Remove gather_positions function

* Remove ClassificationHead as it's outdated

* Update src/transformers/models/splinter/modeling_splinter.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Remove hardcoded 102 token id

* Minor style change

* Added "tau" organization to all model identifiers & URLS

* Added tau to the tests as well

* Copy-from comments

* Removed all unnecessary classes (e.g. SplinterForMaskedLM)

* Running make fix-copies

* Bug fix: Further removed unnecessary classes

* Add Splinter to AutoTokenization

* Add an integration test for Splinter

* Removed initialize_new_qass from config - It will be done through different checkpoints

* Removed `initialize_new_qass` from documentation as well

* Added new checkpoint names (`tau/splinter-base-qass` and same for large) in the code

* Minor change to test

* SplinterTokenizer now doesn't abstract from BertTokenizer

* SplinterTokenizerFast also doesn't abstract from Bert

* style and quality

* bug fix: importing torch in tests only if it's available

* Auto mappings

* Changed copyrights in Splinter's files

* Update src/transformers/models/splinter/configuration_splinter.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: yuvalkirstain <kirstain.yuval@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

439a43b6

13 Aug, 2021 1 commit

Fix VisualBERT docs (#13106) · bda1cb02
Gunjan Chhablani authored Aug 13, 2021

* Fix VisualBERT docs

* Show example notebooks as lists

* Fix style

bda1cb02

12 Aug, 2021 2 commits

Rely on huggingface_hub for common tools (#13100) · 9a498c37
Sylvain Gugger authored Aug 12, 2021

* Remove hf_api module and use huggingface_hub

* Style

* Fix to test_fetcher

* Quality

9a498c37

Deberta tf (#12972) · d329b633

Kamal Raj authored Aug 12, 2021



* TFDeberta

moved weights to build and fixed name scope

added missing ,

bug fixes to enable graph mode execution

updated setup.py

fixing typo

fix imports

embedding mask fix

added layer names to avoid automatic incremental names

+XSoftmax

cleanup

added names to layer

disable keras_serializable
Disentangled attention output shape hidden_size==None
using symbolic inputs

test for Deberta tf

make style

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

removed tensorflow-probability

removed blank line

* removed tf experimental api
+torch_gather tf implementation from @Rocketknight1

* layername DeBERTa --> deberta

* copyright fix

* added docs for TFDeberta & make style

* layer_name change to fix load from pt model

* layer_name change as pt model

* SequenceClassification layername change,
to same as pt model

* switched to keras built-in LayerNormalization

* added `TFDeberta` prefix most layer classes

* updated to tf.Tensor in the docstring

d329b633

09 Aug, 2021 1 commit

replace tgt_lang by tgt_text (#13061) · 76cadb79
SaulLu authored Aug 09, 2021

76cadb79

04 Aug, 2021 3 commits

Add BEiT (#12994) · 83e5a106

NielsRogge authored Aug 04, 2021



* First pass

* Make conversion script work

* Improve conversion script

* Fix bug, conversion script working

* Improve conversion script, implement BEiTFeatureExtractor

* Make conversion script work based on URL

* Improve conversion script

* Add tests, add documentation

* Fix bug in conversion script

* Fix another bug

* Add support for converting masked image modeling model

* Add support for converting masked image modeling

* Fix bug

* Add print statement for debugging

* Fix another bug

* Make conversion script finally work for masked image modeling models

* Move id2label for datasets to JSON files on the hub

* Make sure id's are read in as integers

* Add integration tests

* Make style & quality

* Fix test, add BEiT to README

* Apply suggestions from @sgugger's review

* Apply suggestions from code review

* Make quality

* Replace nielsr by microsoft in tests, add docs

* Rename BEiT to Beit

* Minor fix

* Fix docs of BeitForMaskedImageModeling
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

83e5a106

[Flax] Correct flax docs (#12782) · fbf468b0

Patrick von Platen authored Aug 04, 2021

* fix_torch_device_generate_test

* remove @

* fix flax docs

* correct more docs in flax

* another correction

* fix flax docs

* Apply suggestions from code review

fbf468b0

[Flax] Correctly Add MT5 (#12988) · a317e6c3

Patrick von Platen authored Aug 04, 2021



* finish PR

* finish mt5

* push

* up

* Update tests/test_modeling_flax_mt5.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

a317e6c3