- 07 Oct, 2022 7 commits
-
-
Sourab Mangrulkar authored
* HF <-> Megatron checkpoint conversion: handle reshaping across different tensor-parallel and pipeline-parallel sizes * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * add docstrings and
🐛 fixes Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
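For context, the core reshaping operation in such a conversion is merging or re-splitting tensor-parallel shards. A minimal sketch, with hypothetical function names, assuming column-parallel weights are sharded along dim 0 and row-parallel weights along dim 1:

```python
import torch

def merge_tp_shards(shards, dim):
    # Megatron stores each tensor-parallel rank's slice separately;
    # concatenating along the sharded dim recovers the full HF tensor.
    return torch.cat(shards, dim=dim)

def split_tp_shards(tensor, tp_size, dim):
    # Going the other way, an HF tensor is chunked back into tp_size
    # equal slices, one per target tensor-parallel rank.
    return torch.chunk(tensor, tp_size, dim=dim)

# e.g. a column-parallel linear weight sharded across 2 ranks:
full = merge_tp_shards([torch.randn(512, 1024), torch.randn(512, 1024)], dim=0)
halves = split_tp_shards(full, tp_size=2, dim=0)
```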
-
Thomas authored
* Added type hints for TF: TransfoXL * Added type hints for TF: TransfoXL * Change type hints for training * Change type hints for training
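Illustrative only (a generic stand-in, not the actual TransfoXL code): the pattern is annotating tensor arguments as `Optional[tf.Tensor]` and the Keras training flag as `Optional[bool]`:

```python
from typing import Optional

import tensorflow as tf

class ExampleTFLayer(tf.keras.layers.Layer):
    # Hypothetical layer showing the annotation style the commit applies.
    def call(
        self,
        hidden_states: Optional[tf.Tensor] = None,
        training: Optional[bool] = False,
    ) -> tf.Tensor:
        return hidden_states
```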
-
h authored
-
Bibhabasu Mohapatra authored
* Swin Transformer ONNX support * Updated image dimensions to be dynamic Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
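A rough sketch of what dynamic image dimensions buy you, using plain `torch.onnx.export` rather than the `transformers.onnx` machinery the PR extends (checkpoint name and axis labels are illustrative):

```python
import torch
from transformers import SwinModel

model = SwinModel.from_pretrained("microsoft/swin-tiny-patch4-window7-224")
model.config.return_dict = False  # tuple outputs trace more cleanly
model.eval()

dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    (dummy,),
    "swin.onnx",
    input_names=["pixel_values"],
    output_names=["last_hidden_state", "pooler_output"],
    # Declaring batch/height/width dynamic lets one exported graph
    # accept inputs of varying sizes instead of a fixed 224x224.
    dynamic_axes={"pixel_values": {0: "batch", 2: "height", 3: "width"}},
    opset_version=13,
)
```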
-
IMvision12 authored
* Update modeling_tf_xlm.py * Updates * Update src/transformers/models/xlm/modeling_tf_xlm.py * Update src/transformers/models/xlm/modeling_tf_xlm.py * Update src/transformers/models/xlm/modeling_tf_xlm.py * Update src/transformers/models/xlm/modeling_tf_xlm.py * Update src/transformers/models/xlm/modeling_tf_xlm.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
-
Zachary Mueller authored
-
IMvision12 authored
* ConvBert * added comment * Updated * Final_updates * Update tokenization_convbert.py * Update tokenization_convbert_fast.py * Update tokenization_convbert.py * Update tokenization_convbert.py * Update tokenization_convbert_fast.py * Update tokenization_convbert.py * Update tokenization_convbert_fast.py * Updates * Updates * Updated * Final Updates
-
- 06 Oct, 2022 3 commits
-
-
Alara Dirik authored
-
Ilaygoldman authored
The link to https://github.com/vasudevgupta7/bigbird is vulnerable to repojacking (it redirects to the original project, which has since been renamed); the link should be updated to the project's current name. If the link is left unchanged, an attacker could register the abandoned repository name and attack users who trust the link.
-
Alara Dirik authored
This PR aims to rectify the discrepancy between the training performance of the HF and timm ViT implementations. - Initializes torch and flax ViT dense layer weights with trunc_normal instead of normal (consistent with the TF implementation) - Initializes cls_token and positional_embeddings with trunc_normal - Updates the DeiT copy to reflect the changes
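The change boils down to swapping the initializer; a minimal sketch of the difference (layer dimensions illustrative):

```python
import torch.nn as nn

dense = nn.Linear(768, 3072)

# Before: plain normal initialization.
nn.init.normal_(dense.weight, mean=0.0, std=0.02)

# After: truncated normal, which re-draws values falling outside
# [a, b] = [-2, 2] (the PyTorch defaults), matching timm's trunc_normal_.
nn.init.trunc_normal_(dense.weight, mean=0.0, std=0.02)
```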
-
- 05 Oct, 2022 17 commits
-
-
Sylvain Gugger authored
* Fix pipeline tests for Roberta-like tokenizers * Fix fix
-
Alara Dirik authored
Ensures post_process_instance_segmentation and post_process_panoptic_segmentation methods return a tensor of shape (target_height, target_width) filled with -1 values if no segment with score > threshold is found.
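A minimal sketch of the fallback behaviour described above (function name and sizes illustrative):

```python
import torch

def empty_segmentation(target_size):
    # When no predicted segment clears the score threshold, the map is
    # all -1, i.e. every pixel is explicitly "no segment".
    height, width = target_size
    return torch.full((height, width), -1, dtype=torch.long)

print(empty_segmentation((480, 640)).shape)  # torch.Size([480, 640])
```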
-
Arthur authored
* simplify loop * add feature extractor * add model * start conversion * add dropout * initial commit of test files * conversion for all models * update processor for correct padding * update feature extraction * update integration test logits match * fmt: off for the logits * on the fly mel bank * small nit * update test * update tokenizer * nit feature extraction * update * update tokenizer test * adds logit processor and update tokenizer to get suppress tokens * style * clean convert * revert to original modeling tf utils * Update * update * nit * clean convert file * update tests and nits * quality * slow generation test * ffn_dim to allow customization * update readme * add to toctree * start fixing integration tests * update tests and code * fix feature extractor * fix config tests common * update code to fix tests * fix feature extractor * nit feature extraction * update test for new feature extractor * style * add abstract * large logits with custom decoder input ids * wrap around is_torch_available * fix feature extractor * correct logits for whisper small.en * nit * fix encoder_attention_mask * some fixes * remove unnecessary inputs * nits * add normalizer file * update test tokenization * fix attention mask not defined * Add model to README * Fix doc tests * fix generate * remove useless encoder attention mask * update test modeling whisper * update config to add second non-suppress tokens * nits on feature extractor * nit for test tokenizers * update tests * update tests * update tokenization test * fixup * invalidated hf token. Clean convert openai to whisper * fix logit tests * fixup * clean merge * revert toc_tree changes * remove useless LogitProcessor * Update whisper.mdx * update config file doc * update configuration docstring * update test tokenization * update test tokenization * update tokenization whisper Added copied from where needed * update feature extraction * nit test name * style * quality * remove get suppress tokens and update non_speech tokens global variables * Update src/transformers/models/whisper/feature_extraction_whisper.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * clean modeling whisper and test Removed the attention mask arguments that are deprecated * fix large test * Add multilingual audio test, and translate test * style * fix large multilingual test * nits * Update docs/source/en/model_doc/whisper.mdx Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * add copied from for attention layer * remove attention masks in doc * add english normalizer * update tokenization test * remove copied from in whisper attention: no bias in k_proj only * wrap around dependencies in english normalizer * style * correct import generation logits * for now, wrap feature extractor with torch * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/whisper.mdx Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove torch dependencies for feature extraction and style * fixup * nit * update logits * style * nit * nits and fix final tests * add `is_more_itertools_available` to utils * quality * add begin suppress tokens, suppress tokens to generate args and config * clean SuppressTokensLogitsProcessor in generation logits * Nit naming * add SuppressTokensAtBegin * update tests, suppress tokens to None or correct values * nit and style * update RAG to fit test and generate_logit * add copy-pasted statement on english normalizer * add arguments to config_common_kwargs * Update src/transformers/generation_utils.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/generation_logits_process.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * revert changes based on reviews * update doc and nits * more nits * last nits * update test configuration common * add BART name in decoder attention mask documentation * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * style * nit * nit * add english.json file to git * nits on documentation * nit * nits * last styling * add main toctree file * remove sentencepiece dependency * clean init file * fix tokenizer that has no dependencies on sentencepiece * update whisper init file, nit * remove english.json file * add get decoder prompt id * revert changes and add forced logit processor * nit * clean normalizer * remove protected * update * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update based on review * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add batched tests Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
NielsRogge <niels.rogge1@gmail.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
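For orientation, a minimal usage sketch of the model this PR adds (checkpoint name and silent dummy audio are illustrative; real input would be a 16 kHz waveform):

```python
import numpy as np
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# 5 seconds of silence at 16 kHz stands in for real speech here.
audio = np.zeros(16000 * 5, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

generated_ids = model.generate(inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```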
-
Alara Dirik authored
Ensures post_process_instance_segmentation and post_process_panoptic_segmentation methods return a tensor of shape (target_height, target_width) filled with -1 values if no segment with score > threshold is found.
-
Zachary Mueller authored
-
Harsha authored
* removes roberta and bert config dependencies from longformer * adds copied from statements * fixes style * removes excessive comments and replaces bert with longformer in a couple of places * fixes style
-
Paula Isabel authored
-
Matt authored
* Add a build_from_serving_sig_and_dummies method and replace all calls like model(model.dummy_inputs) with it. * make fixup * Remove the overridden save() as this is no longer necessary * Also call _set_save_spec(), the last missing piece * Ensure we set the save spec when loading from config too * Turn this whole thing into a one-line PR * Turn this whole thing into a one-line PR * Turn this whole thing into a one-line PR Co-authored-by: Your Name <you@example.com>
-
Sylvain Gugger authored
-
Younes Belkada authored
* change `BloomConfig` docstring - slightly change the docstring of the `BloomConfig` - Use correct default vocab size - Use correct default `hidden_dim`, `n_head` * Update src/transformers/models/bloom/configuration_bloom.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bloom/configuration_bloom.py Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com> * make style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com>
-
Harsha authored
* copies the RoBERTa tokenizer over to LongformerTokenizer since the two are identical * adds Copied from patterns to pass the copy check
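The "Copied from" mechanism referenced here is a repo convention: `make repo-consistency` parses these comments and fails if the copy drifts from its source. A schematic, not the real file (body elided):

```python
from transformers import PreTrainedTokenizer

# Copied from transformers.models.roberta.tokenization_roberta.RobertaTokenizer with Roberta->Longformer
class LongformerTokenizer(PreTrainedTokenizer):
    """Standalone tokenizer kept byte-for-byte in sync with the RoBERTa
    original by the check-copies tooling."""
```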
-
r-terada authored
* add sudachipy and jumanpp tokenizers for bert_japanese * use ImportError instead of ModuleNotFoundError in SudachiTokenizer and JumanppTokenizer * put test cases of test_tokenization_bert_japanese in one line * add require_sudachi and require_jumanpp decorator for testing * add sudachi and pyknp (jumanpp) to dependencies * remove sudachi_dict_small and sudachi_dict_full from dependencies * empty commit for ci
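A usage sketch for the new word tokenizers, assuming the optional packages are installed (checkpoint name illustrative):

```python
from transformers import BertJapaneseTokenizer

# "sudachi" needs `pip install sudachipy sudachidict_core`; "jumanpp"
# needs the pyknp package plus a local Juman++ install.
tokenizer = BertJapaneseTokenizer.from_pretrained(
    "cl-tohoku/bert-base-japanese",
    word_tokenizer_type="sudachi",
)
print(tokenizer.tokenize("外国人参政権"))
```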
-
mustapha ajeghrir authored
Co-authored-by: Mustapha AJEGHRIR <mustapha.ajeghrir@kleegroup.com>
-
Druhin Abrol authored
* remove XLMTokenizer inheritance from FlaubertTokenizer * remove XLMTokenizer inheritance from FlaubertTokenizer * remove XLMTokenizer inheritance from FlaubertTokenizer * remove XLMTokenizer inheritance from FlaubertTokenizer: fixed styling * removed repo-consistency issue
-
Shyam Sudhakaran authored
-
Divyanshu Kumar authored
* removed ProphetNet tokenization's interdependency on BertTokenizer * fix: style
-
Alara Dirik authored
- Improves MaskFormer docs, corrects minor typos - Restructures MaskFormerFeatureExtractor.post_process_panoptic_segmentation for better readability, adds target_sizes argument for optional resizing - Adds post_process_semantic_segmentation and post_process_instance_segmentation methods. - Adds a deprecation warning to post_process_segmentation method in favour of post_process_instance_segmentation
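A hedged end-to-end sketch of the new post-processing entry points (checkpoint name and image size illustrative):

```python
import numpy as np
import torch
from transformers import MaskFormerFeatureExtractor, MaskFormerForInstanceSegmentation

feature_extractor = MaskFormerFeatureExtractor.from_pretrained("facebook/maskformer-swin-base-ade")
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade")

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # stand-in image
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# target_sizes resizes predictions back to the original resolution.
semantic_map = feature_extractor.post_process_semantic_segmentation(
    outputs, target_sizes=[(480, 640)]
)[0]
print(semantic_map.shape)  # torch.Size([480, 640])
```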
-
- 04 Oct, 2022 13 commits
-
-
Druhin Abrol authored
* removing XLMConfig inheritance from FlaubertConfig * removing XLMConfig inheritance from FlaubertConfig * Fixed styling issue * Update configuration_flaubert.py Co-authored-by: Druhin Abrol <druhinabrol@192.168.1.6>
-
Erin authored
* Remove interdependency from OpenAI tokenizer * Adjust import order for linter
-
Samuel Arcadinho authored
* Clamping hidden state values to allow FP16 * Reformatting * Adding missing if condition * Update src/transformers/models/longt5/modeling_longt5.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/longt5/modeling_longt5.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/longt5/modeling_longt5.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Formatting file Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
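The clamping fix follows the pattern the T5 family already uses; a sketch, assuming the standard finfo-based bound:

```python
import torch

def clamp_for_fp16(hidden_states: torch.Tensor) -> torch.Tensor:
    # fp16 activations can overflow to inf; clamping to just inside the
    # representable range keeps the forward pass finite. The `if` is the
    # "missing if condition" above: fp32 tensors pass through untouched.
    if hidden_states.dtype == torch.float16:
        clamp_value = torch.finfo(torch.float16).max - 1000
        hidden_states = torch.clamp(hidden_states, min=-clamp_value, max=clamp_value)
    return hidden_states
```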
-
Younes Belkada authored
* add bloom for question answering - attempt to add Bloom for question answering - adapted from `GPTJForQuestionAnswering` - Fixed `num_labels` to `2` for common tests - Added a bit of docstring - All common tests pass * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert changes related to `num_labels` Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
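The adaptation follows the standard extractive-QA head pattern used by `GPTJForQuestionAnswering`; a schematic of that head (class name hypothetical), which also explains the `num_labels=2` in the common tests:

```python
import torch
import torch.nn as nn

class QAHead(nn.Module):
    # One linear layer maps each token's hidden state to two logits:
    # the scores for being the answer span's start and end positions.
    def __init__(self, hidden_size: int):
        super().__init__()
        self.qa_outputs = nn.Linear(hidden_size, 2)

    def forward(self, hidden_states: torch.Tensor):
        logits = self.qa_outputs(hidden_states)
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```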
-
Sushrut1101 authored
-
Arnaud Stiegler authored
* removing BertConfig inheritance * fix missing arguments
-
Partho authored
-
Partho authored
-
Partho authored
-
Partho authored
-
Oscar Dominguez authored
in stale.yml
-
Oscar Dominguez authored
-
gouqi_nju authored
-