Commits · 68a894a5875bfd958b8254afd3bbb23db9c2e813 · chenpangpang / transformers

02 Aug, 2022 1 commit

Fix uninitialized parameter in conformer relative attention. (#18368) · 68a894a5

Piotr Dabkowski authored Aug 02, 2022

`torch.Tensor` creates an uninitialized tensor (as `torch.empty` does). This leads to non-deterministic behavior, poor initialization, and NaNs if the init is unlucky. The paper does not specify the initialization for the bias terms, so I guess zero is a good choice: no bias initially. Memory returned by `torch.Tensor` is usually filled with zeros, so this fix will be close to the intended behavior:

```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```

68a894a5
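
A minimal sketch of the kind of fix described in this commit message, assuming the bias terms are stored as `nn.Parameter`s. The class name `RelativeAttentionBiases` and the attribute names `pos_bias_u` and `pos_bias_v` are illustrative assumptions, not taken from the commit itself.

```python
import torch
from torch import nn


class RelativeAttentionBiases(nn.Module):
    """Hypothetical stand-in for the bias terms of a relative-attention layer."""

    def __init__(self, num_heads: int, head_size: int):
        super().__init__()
        # torch.zeros gives a defined starting value. torch.Tensor(...) and
        # torch.empty(...) return whatever happens to be in the allocated memory.
        self.pos_bias_u = nn.Parameter(torch.zeros(num_heads, head_size))
        self.pos_bias_v = nn.Parameter(torch.zeros(num_heads, head_size))


biases = RelativeAttentionBiases(num_heads=4, head_size=8)
total = biases.pos_bias_u.sum() + biases.pos_bias_v.sum()
print(total.item())
```

With explicit zeros the sum is the same on every run, which is the property the unfixed code could not guarantee.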

01 Aug, 2022 19 commits

fix: create a copy for tokenizer object (#18408) · df5e4232

Yassine authored Aug 01, 2022

df5e4232

Layoutlmv2 tesseractconfig (#17733) · 24845aeb

Kelvin Kong authored Aug 02, 2022



* Added option for users to modify config parameter used by pytesseract during feature extraction

- Added optional 'tess_config' kwarg when setting up LayoutLMV2 processor that is used by pytesseract during feature extraction
- E.g. can be used to modify psm values by setting tess_config to '--psm 7'
- Different psm values significantly influence the output of layoutlmv2

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated variable names to be more explicit

* Fixed styles

* Added option for users to modify config parameter when calling pytesseract during feature extraction

- Added option to set "tesseract_config" parameter during LayoutLMV3 processor initialization
- Can be used to modify PSM values, e.g. by setting tesseract_config="--psm 6"

* Removed  from function signature
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

24845aeb
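
The config strings quoted in this commit message (`--psm 7`, `--psm 6`) select a Tesseract page segmentation mode. As a small illustration, the hypothetical helper below (`build_tesseract_config` is not part of the library) builds such a string and rejects values outside the 0 to 13 range that Tesseract documents.

```python
def build_tesseract_config(psm: int) -> str:
    """Hypothetical helper: build a config string selecting a page segmentation mode."""
    # Tesseract documents page segmentation modes 0 through 13.
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    return f"--psm {psm}"


single_line = build_tesseract_config(7)    # mode 7: treat the image as a single text line
uniform_block = build_tesseract_config(6)  # mode 6: assume a single uniform block of text
print(single_line, "|", uniform_block)
```

Per the commit message, a string like this would be passed as `tesseract_config` when setting up the processor.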

Split model list on modality (#18328) · 151a2aaa

Steven Liu authored Aug 01, 2022

* 📝 split up model list

* Adapt script to reorg

* apply niels feedback
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

151a2aaa

Rewrite push_to_hub to use upload_files (#18366) · 01db72ab

Sylvain Gugger authored Aug 01, 2022

* Rewrite push_to_hub to use upload_files

* Adapt the doc a bit

* Address review comments and clean doc

01db72ab

Add Flax BART pretraining script (#18297) · 3909d7f1

Duong A. Nguyen authored Aug 01, 2022



* add bart pretraining flax script

* fixup

* add bart pretraining flax script

* add BART to README

* add BART to README

* add BART to README

* add BART to README

* add BART to README

* add bos eos document

* Update README.md

* Update README.md

* Update examples/flax/language-modeling/run_bart_dlm_flax.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* final

* final

* final

* remove use_auth_token in from_config
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

3909d7f1

Fix ROUGE add example check and update README (#18398) · 941d2331

Sylvain Gugger authored Aug 01, 2022

* Fix ROUGE add example check and update README

* Stay consistent in values

941d2331

Adding fine-tuning models to LUKE (#18353) · 62098b93

Ikuya Yamada authored Aug 02, 2022

* add LUKE models for downstream tasks

* add new LUKE models to docs

* fix typos

* remove commented lines

* exclude None items from tuple return values

62098b93

Fix docs (#18399) · 7b9e995b

NielsRogge authored Aug 01, 2022


Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

7b9e995b

Add balanced strategies for device_map in from_pretrained (#18349) · e0bc4c73

Sylvain Gugger authored Aug 01, 2022



* Add balanced strategies for device_map in from_pretrained

* Add safeguards for Accelerate version

* Update src/transformers/modeling_utils.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Style
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

e0bc4c73

Fix doc tests (#18397) · 39e76d76

NielsRogge authored Aug 01, 2022

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

39e76d76

Fix OPT doc tests (#18365) · 11413711

Arthur authored Aug 01, 2022

11413711

Add evaluate to test dependencies (#18396) · af1e6b4d

Sylvain Gugger authored Aug 01, 2022

af1e6b4d

Add a check regarding the number of occurrences of ``` (#18389) · bd6d1b43

Yih-Dar authored Aug 01, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

bd6d1b43
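
The commit title only names the check. One plausible form, sketched here under the assumption that it counts triple-backtick markers and requires an even number of them, looks like this (the function name `has_balanced_fences` is hypothetical):

```python
FENCE = "`" * 3  # the triple-backtick marker, built this way so the example is easy to embed


def has_balanced_fences(text: str) -> bool:
    """Return True when the number of fence markers in the text is even."""
    return text.count(FENCE) % 2 == 0


good = f"Example:\n{FENCE}python\nprint('hi')\n{FENCE}\n"
bad = f"Example:\n{FENCE}python\nprint('hi')\n"
print(has_balanced_fences(good), has_balanced_fences(bad))
```

An odd count means some code block was opened and never closed, which typically breaks the rendering of everything after it.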

Fix from_pretrained kwargs passing (#18387) · 1cd7c6f1

YouJiacheng authored Aug 01, 2022

Fix #18385
I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess they should.

1cd7c6f1

Remove pt-like calls on tf tensor (#18393) · 96b5d7db

amyeroberts authored Aug 01, 2022

96b5d7db

Correct the spelling of bleu metric (#18375) · 679d68a1

Ogundepo Odunayo authored Aug 01, 2022

679d68a1

Migrate metric to Evaluate in Pytorch examples (#18369) · 1f843991

atturaioe authored Aug 01, 2022

* Migrate metric to Evaluate in pytorch examples

* Remove unused imports

1f843991

Bump mistune from 0.8.4 to 2.0.3 in /examples/research_projects/lxmert (#18370) · 25ec12ea

dependabot[bot] authored Aug 01, 2022

Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

25ec12ea

Bump mistune in /examples/research_projects/visual_bert (#18371) · a7360385

dependabot[bot] authored Aug 01, 2022

Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

a7360385

30 Jul, 2022 1 commit

fix FSDP ShardedGradScaler (#18358) · b2e4b091

Sourab Mangrulkar authored Jul 30, 2022

renaming it

b2e4b091

29 Jul, 2022 6 commits

Fix TFSegformerForSemanticSegmentation doctest (#18362) · 51227e26

Yih-Dar authored Jul 29, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

51227e26

[FX] Symbolic trace for Bloom (#18356) · 4e2f4a92

Michael Benayoun authored Jul 29, 2022

* Bloom model can now be traced

* Bloom traced model can be torch scripted and serialized

* Bloom can be traced with variable keyword arguments

* Enable XLNet support

* Disable XLNet for now

4e2f4a92

Fix some doctests (#18359) · 1763770b

Yih-Dar authored Jul 29, 2022



* Fix some doctests
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

1763770b

Replace `as_target` context managers by direct calls (#18325) · 986526a0

Sylvain Gugger authored Jul 29, 2022



* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py
Co-authored-by: amyeroberts <amy@huggingface.co>

* Style
Co-authored-by: amyeroberts <amy@huggingface.co>

986526a0

Fix OwlViT torchscript tests (#18347) · a64bcb56

Yih-Dar authored Jul 29, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

a64bcb56

[Docs] Fix Speech Encoder Decoder doc sample (#18346) · a4ee463d

Sanchit Gandhi authored Jul 29, 2022

* [Docs] Fix Speech Encoder Decoder doc sample

* improve pre-processing comment

* make style

a4ee463d

28 Jul, 2022 10 commits

Migrate metrics used in flax examples to Evaluate (#18348) · da503ea0

Vijay S Kalmath authored Jul 28, 2022

Currently, tensorflow examples use the `load_metric` function from the
Datasets library. This commit migrates the function call to the `load`
function from the Evaluate library.

da503ea0

Migrate metric to Evaluate library for tensorflow examples (#18327) · a2586795

Vijay S Kalmath authored Jul 28, 2022

* Migrate metric to Evaluate library in tf examples

Currently, tensorflow examples use the `load_metric` function from the
Datasets library. This commit migrates the function call to the `load`
function from the Evaluate library.

Fix for #18306

* Migrate metric to Evaluate library in tf examples

Currently, tensorflow examples use the `load_metric` function from the
Datasets library. This commit migrates the function call to the `load`
function from the Evaluate library.

Fix for #18306

* Migrate `metric` to Evaluate for all tf examples

Currently, tensorflow examples use the `load_metric` function from the
Datasets library. This commit migrates the function call to the `load`
function from the Evaluate library.

a2586795

[BLOOM] Deprecate `position_ids` (#18342) · 7b090876

Thomas Wang authored Jul 28, 2022

7b090876

Include tensorflow-aarch64 as a candidate (#18345) · 9c336657

Ankur Goyal authored Jul 28, 2022

Co-authored-by: Ankur Goyal <ankur@impira.com>

9c336657

Remove Flax OPT from doctest for now (#18338) · b53dab60

Yih-Dar authored Jul 28, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

b53dab60

Fix codeparrot deduplication - ignore whitespaces (#18023) · 286a18fa

Loubna Ben Allal authored Jul 28, 2022

* ignore whitespaces for hash

* reformat code

* Update README.md

286a18fa
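
To illustrate the idea in this commit title, here is a minimal sketch of whitespace-insensitive hashing for exact deduplication. The function name `content_hash` and the choice of SHA-256 are assumptions made for the example, not details taken from the commit.

```python
import hashlib
import re

WHITESPACE = re.compile(r"\s+")


def content_hash(content: str) -> str:
    """Hash file content after removing all whitespace characters."""
    stripped = WHITESPACE.sub("", content)
    return hashlib.sha256(stripped.encode("utf-8")).hexdigest()


a = "def f(x):\n    return x + 1\n"
b = "def f(x):\n\treturn x+1"
print(content_hash(a) == content_hash(b))
```

Two files that differ only in indentation or spacing then share a hash, so they are treated as duplicates.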
Update automatic_speech_recognition.py (#18339) · 5d1fed07

bhuang authored Jul 28, 2022

5d1fed07

Updated _toctree.yml (#18337) · 985c7e3a

Nicola Procopio authored Jul 28, 2022

985c7e3a

updated translation (#18333) · a8e27957

Edoardo Federici authored Jul 28, 2022

Left the term fine-tuning since there is no correct translation into Italian and the English term is generally used. The same was done with some terms like "learning rate"

a8e27957

fixed typo (#18331) · 1e380c7d

Edoardo Federici authored Jul 28, 2022

1e380c7d

27 Jul, 2022 3 commits

Update feature extractor docs (#18324) · 96be1b7f

Steven Liu authored Jul 27, 2022

As pointed out by @NielsRogge, a feature extractor is used to prepare inputs for a model with a single modality rather than multimodal models.

96be1b7f

start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch … (#18229) · 2b81f72b

Wang, Yi authored Jul 27, 2022



* starting from 1.12, torch_ccl is renamed oneccl_bindings_for_pytorch and should be imported before use
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add doc for perf_train_cpu_many
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update doc
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

2b81f72b
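
A hedged sketch of how code might cope with the rename described in this commit: try the new module name first and fall back to the old one. The helper `import_ccl_bindings` is hypothetical and is not the code from the commit.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_ccl_bindings() -> Optional[ModuleType]:
    """Try the new package name first, then the old one; return None if neither is installed."""
    for name in ("oneccl_bindings_for_pytorch", "torch_ccl"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


ccl = import_ccl_bindings()
print("CCL bindings available:", ccl is not None)
```

Returning `None` instead of raising lets the caller decide whether missing bindings are an error or just mean another backend should be used.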

Add swin transformer v2 (#17469) · e87ac9d1

Ritik Nandwal authored Jul 27, 2022



* Add files generated using transformers-cli add-new-model-like command

* Add changes for swinv2 attention and forward method

* Add fixes

* Add modifications for weight conversion and remaining args in swin model

* Add changes for patchmerging

* Add changes for SwinV2selfattention

* Update conversion script

* Add final fixes for the swin_v2 model

* Add changes for conversion script for pretrained window size case

* Add pretrained window size value from config in SwinV2Encoder class

* Make fixup

* Add swinv2 to models_not_in_readme to utils/check_copies.py

* Modify Swinv2v2 to Swin Transformer V2

* Remove copied from, to run make fixup command

* Add updates to swinv2tf from main branch

* Add pretrained_window_size to config, to make tests pass

* Add modified weights from nandwalritik profile for swinv2

* Update model weights from swinv2 from nandwalritik profile

* Add fix for build_pr_documentation CI fix

* Add fixes for weight conversion

* Add change to make input with padding work

* Add fixes for test cases

* Add few changes from swin to swinv2 to pass test cases

* Remove tests for tensorflow as swinv2 for TF is not added yet

* Override test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet

* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.

* Update docs url for swinv2 in README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Undo changes for check_repo

* Update url in readme.md

* Remove overridden function to test pt_tf_model_equivalence

* Remove TF model imports for Swinv2 as it's not implemented in this PR

* Add changes for index.mdx

* Add swinv2 paper link, abstract and contributor details

* Rename cpb_mlp to continuous_position_bias_mlp

* Add tips for swinv2 model

* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update import order in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add copyright statements in weights conversion script.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove Swinv2 from models_not_in_readme

* Reformat code

* Remove TF implementation file for swinv2

* Update start docstring.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add changes for docstring

* Update orgname for weights to microsoft

* Remove to_2tuple function

* Add copied from statements wherever applicable

* Add copied from to Swinv2ForMaskedImageModelling class

* Reformat code.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add unittest.skip(with reason.) for test_inputs_embeds test case.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add updates for test_modeling_swinv2.py

* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function

* Add continuous_position_bias_mlp parameter to conversion script

* Add test for testing masked_image_modelling for swinv2

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add suggested changes

* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder

* Add push_to_hub flag to weight conversion script

* Change order of Swinv2DropPath class

* Add id2label mapping for imagenet 21k

* Add updated url for SwinV2 functions and classes used in implementation

* Update input_feature dimensions format, mentioned in comments.
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Add suggested changes for modeling_swin2.py

* Update docs

* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.

* Fix indentation.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes for making Nit objects in code style

* Add suggested changes

* Add suggested changes for test_modelling_swinv2

* make fix-copies

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e87ac9d1