Commits · 1690094bdb8f606685e46f26c17ddadb925d0a20 · chenpangpang / transformers

13 Jun, 2022 2 commits

Update modeling_gpt_neox.py (#17575) · 54833886

Will Frey authored Jun 13, 2022

I'm guessing that the intention was to have the `_no_split_modules` class attribute for `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how its set as `["GPTJBlock"]` for `GPTJPreTrainedModel`.

If this is incorrect, please feel free to just close the PR.

Thanks!

54833886

Add Visual Question Answering (VQA) pipeline (#17286) · 66336dc1

Sijun He authored Jun 13, 2022



* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* Update src/transformers/models/auto/modeling_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* merge
Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

66336dc1

10 Jun, 2022 2 commits
- Fix style · cdaed367
  Lysandre authored Jun 10, 2022
  
  cdaed367
- Move Clip image utils to image_utils.py (#17628) · 6e93d947
  Alara Dirik authored Jun 10, 2022
```
* move clip image utils to image_utils.py

* dont default to square images

* fix typo, revert change to test file

* edit convert_rgb comments
```
  6e93d947
09 Jun, 2022 5 commits

convert assertion to raised exception in debertav2 (#17619) · fba0b6a8

mrbean authored Jun 09, 2022

* convert assertion to raised exception in debertav2

* change assert to raise exception in deberta

* fix messages

fba0b6a8

Use shape_list to safely get shapes for Swin (#17591) · 9fc34235

amyeroberts authored Jun 09, 2022

* Use shape_list to safely get shapes

* Add relevant test

* Tidy and add metrics

* Resolve dynamic shaping issues and move test

* Tidy up and all samples in batch

* Formatting

9fc34235

Add ONNX support for ConvNeXT (#17627) · e0be053e
regisss authored Jun 09, 2022

e0be053e
Add ONNX support for ResNet (#17585) · 5323094a
regisss authored Jun 09, 2022
```
* Add ONNX support for ResNet

* Add ONNX test

* make fix-copies
```
5323094a

BLOOM (#17474) · ca2a55e9

Younes Belkada authored Jun 09, 2022



* adding template

* update model

* model update

* update conf for debug model

* update conversion

* update conversion script

* update conversion script

* fix missing keys check

* add tests to test the tokenizer in the local machine

* Change variable name

* add tests on xnli dataset

* add more description

* add descriptions + clearer code

* clearer code

* adding new tests + skipping few tests because of env problems

* change comment

* add dtype on the configuration

* add test embeddings

* add hardcoded test

* fix dtype issue

* adding torch.float16 to config

* adding more metrics (min, max, mean)

* add sum

* now the test passes with almost equal

* add files for conversion - test passes on cpu  gpu

* add final changes

* cleaning code

* add new args in the docstring

* fix one liner function

* remove macros

* remove forward attention

* clean up init funtion

* add comments on the issue

* rm scale mask softmax

* do make style

* fix dtype in init

* fixing for loop on att probs

* fix style with black

* fix style + doc error

* fix and debug CI errors (docs + style)

* some updates

- change new operations
- finally add scaled softmax
- added new args in the config

* make use cache working

* add changes

- save sharded models
- final changes on the modeling script

* add changes

- comment on alibi
- add TODO on seq length

* test commit

- added a text to test the commit
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* final changes

- attention mask change
- generation works on BS176b
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* changes - model + conversion

* move to correct dir

* put ,

* fex fixes

* fix tokenizer autodoc

* fix minor CI issues

* fix minor CI issues

* fix minor CI issues

* fix style issue

* fix minor import issues

* fix few issues

* remove def main on the test

* add require torch

* replace decorator with 'with'

* fix style

* change to bloom

* add quick fix tokenizer

* fix tokenizer file

* fix tokenizer

- merge tests
- small fixes

* fix import issue

* add bloom to readme

* fix consistency

* Update docs/source/en/model_doc/bloom.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

fix comment issues on file headers
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix doc issue

* small fix - modeling test

* some changes

- refactor some code
- taking into account reviews
- more tests should pass
- removed pruning tests

* remove useless division

* more tests should pass

* more tests should pass

* more tests should pass

* let's try this one

-add alibi offset
- remove all permutes to make the grad operations work
- finger crossed

* refactor

- refactor code
- style changes
- add new threshold for test

* major changes

- change BLOOM to Bloom
- add quick doc on bloom.mdx
- move embeddings test on modeling test

* modify readme

* small fixes

* small fix

- better threshold for a test

* remove old test file from fetcher

* fix small typo

* major change

- change BloomLMHead to BloomForCausalLM

* remove onnx config

* major changes

- refactor the code
- remove asserts
- change tol for test

* make style

* small change

* adding a slow test + commenting old ones for now

* make style

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

* fix duplicates

* cleaning comments on config

* clean a bit conversion file

* refacor a bit modeling file

* refactor tokenizer file

* fix tokenization test issue

* fix tokenization issue #2

* fix tokenization issue second try

* fix test issue

* make style + add suggestions

* change test fetcher

* try this one

- slow tests should pass
- finger crossed

* possible final changes

* make style

* try fix padding side issue

* fix side

* fix padding issue

* fix ko-readme

* fix config auto

* cleaning modeling file

* keep bloom in caps in ko

* update config docs

* remove pretraining_pp

* remove model parallel

* update config

- add correct config files

* fix duplicates

* fix fetcher

* fix refactor issue

- remove divide function

* try to remove alibi

* small fixes

- fix alibi
- remove seq length
- refactor a bit the code

* put correct values

- fix bos and eos token ids

* fix attention mask loop
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* small fixes:

- remove skip bias add

* small fixes

- fix typo in readme
- fix typos in config

* small changes

- remove a test
- add reconstruction test
- change config

* small changes

- change Scaled Softmax to BloomScaledSoftmax

* small fixes

- fix alibi dtype

* major changes

- removing explicit dtype when loading modules
- fixing test args (torch_dtype=auto)
- add dosctring

* fix readmes

* major changes

- now bloom supports alibi shifting
- refactor a bit the code
- better test tolerance now

* refactor a bit

* refactor a bit

* put correct name on test

* change docstring

* small changes

- fix docstring modeling
- fix test tolerance

* fix small nit

- take dtype from tensors in the conversion script

* minor fix

- fix mdx issue

* minor fix

- change config docstring

* forward contrib credits from PR14084

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply modifications
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* resolve softmax upcast

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

* final changes modeling
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Merge commit 'd156898f

'

* merge commit

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions

Apply suggestions from Stas comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix gradient checkpointing
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add slow but exact

* add accelerate compatibility
Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>

* forward contrib credits
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix torch device on tests

* make style

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix nits

Co-authored-by: patrickvonplaten<patrickvonplaten@users.noreply.github.com>

* remove final nits

* fix doc

- add more details on the doc
- add links to checkpoints

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions
Co-authored-by: sgugger <sgugger@users.noreply.github.com>

* put test torchscript to false

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: justheuristic <justheuristic@gmail.com>

* fix alibi

- create alibi only once

* add small doc

* make quality

* replace torch.nn

* remove token type emb

* fix fused op + output bias

* add fused op

- now can control fused operation from config

* remove fused op

* make quality

* small changes

- remove unsed args on config
- removed bias gelu file
- make the model torchscriptable
- add torchscript slow tests

* Update src/transformers/models/bloom/modeling_bloom.py

* fix slow

* make style

* add accelerate support

* add bloom to deepspeed tests

* minor changes

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* minor change

* slow tests pass

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/bloom.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor changes:

- change docstring
- add link to paper
Co-authored-by: Thomwolf <thomwolf@gmail.com>
Co-authored-by: Thomas Wolf <thomas@huggingface.co>
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sIncerass <sheng.s@berkeley.edu>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: justheuristic <justheuristic@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>

ca2a55e9

08 Jun, 2022 2 commits

TF: Merge PT and TF behavior for Bart when no decoder_input_ids are passed (#17593) · e9d51387
Joao Gante authored Jun 08, 2022
```
* Merge PT and TF behavior
```
e9d51387

Add TFData2VecVision for semantic segmentation (#17271) · 9d99489f

Sayak Paul authored Jun 08, 2022



* feat: initial implementation of data2vec segmentation model in TF.

* chore: minor corrections to make the segmenter work.

* chore: removed unncessary files.

* chore: add tests and other modifications.

* fix: loss computation for segmentation.

* chore: remove unused variable.

* chore: formatting.

* added a dummy adaptive pooling layer.

* removed unnecessary file.

* potentially add identifiers to layer names.

* fix: layer naming.

* chore: removed unnecessary print.

* Skipping unneeded test

* chore: add logging to debug tolerance.

* fix: segmentation tests for tfdata2vecvision

* chore: make style.

* fix: layer names, assertion to be resolved.

* Bumping test tolerance a bit

* chore: bump the tol in PT test.
Co-authored-by: matt <rocketknight1@gmail.com>

9d99489f

07 Jun, 2022 3 commits

fix (#17589) · c6cea5a7

Yih-Dar authored Jun 08, 2022


Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c6cea5a7

M-CTC-T Model (#16402) · 119e3c0f

Chan Woo Kim authored Jun 08, 2022



* added cbs to notebooks, made copy-paste error fix in generation_utils

* initial push for mctc model

* mctc feature extractor done

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* passing attention, now struggling to figure out how attention masks make sense here

* works when excluding attention masks. ask later how one would integrate attention maskshere

* bizarre configuration error (model prefix comes first in config dict json and messes up the order)

* all passing but bizzarre config dict ordering issue when to_dict

* passing all major tests

* feature extraction, processor, tokenizer added & tests passing

* style & consistency & other logistical fixes

* copy paste fix

* model after feature extraction working

* commiting final feature extraction results; need to fix normalization

* feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?

* delete print ; format code a bit

* fixing tests

* passing major tests

* fixing styles

* completed tokenization test with real example; not sure if these values are entirely correct.

* last test fixes from local

* reverting accidentally included custom setup configs

* remove load tf weights; fix config error

* testing couldnt import featureextractor

* fix docs

* fix docs

* resolving comments

* style fixes

* style fixes

* Update to MCTCConv1dSubSampler
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* relposemb fixes

* conv1d name issue; expecting config fail with paraentheses

* fix config issue

* fix config issue

* fix config issue

* change everything to MCTCT

* fixing naming change errors

* archive list

* copyrights and docs

* copyrights and docs

* copyrights and docs

* merge resolution

* move tests, fix to changed optionaldependency structure

* test directories changed

* fixing tests

* how to avoid tf tests?

* how to avoid tf tests?

* tests passing locally

* allow mctctprocessor imported any env

* allow mctctprocessor imported any env

* fixed second round of feedback, need to fix docs

* doc changes not being applied

* all fixed

* style fix

* feedback fixes

* fix copies and feature extraction style fix

* Update tests/models/visual_bert/test_modeling_visual_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* copy paste huggingface:main visual bert

* added eof newline to visual bert; all tests are passing otherwise

* fix slow tests by adding attention mask

* change model id to speechbrain

* make fix-copies

* fix readme unwanted deletes

* fixing readmes, make fix-copies

* consistent M-CTC-T naming

* Update src/transformers/models/mctct/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* all fixed but variable naming

* adjust double quotes

* fixed variable names

* copyright and mr quilter

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct slow tests

* make fix-copies

* Update src/transformers/models/mctct/configuration_mctct.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mctct/configuration_mctct.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* m-ctc-t not mctct
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

119e3c0f

Fx support for Deberta-v[1-2], Hubert and LXMERT (#17539) · 5c8f6010

Michael Benayoun authored Jun 07, 2022

* Support for deberta and deberta-v2

* Support for LXMert

* Support for Hubert

* Fix for pt1.11

* Trigger CI

5c8f6010

06 Jun, 2022 2 commits
- Remove circular imports in layoutlm/__init__.py (#17576) · ad719652
  regisss authored Jun 06, 2022
  
  ad719652
- Remove RuntimeErrors for NaN-checking in 20B (#17563) · 2e37ef35
  Jason Phang authored Jun 06, 2022
  
  2e37ef35
03 Jun, 2022 4 commits

Clean imports to fix test_fetcher (#17531) · c4e58cd8

Sylvain Gugger authored Jun 03, 2022



* Clean imports to fix test_fetcher

* Add dependencies printer

* Update utils/tests_fetcher.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Fix Perceiver import
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

c4e58cd8

Fix bug - layer names and activation from previous refactor (#17524) · 1c57242d
amyeroberts authored Jun 03, 2022
```
* Fix activation and layers in MLP head

* Remove unused import
```
1c57242d

Add support for Perceiver ONNX export (#17213) · babeff55

Patrick Deutschmann authored Jun 03, 2022



* Start adding perceiver support for ONNX

* Fix pad token bug for fast tokenizers

* Fix formatting

* Make get_preprocesor more opinionated (processor priority, otherwise tokenizer/feature extractor)

* Clean docs format

* Minor cleanup following @sgugger's comments

* Fix typo in docs

* Fix another docs typo

* Fix one more typo in docs

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

babeff55

Add Gated-SiLU to T5 (#17420) · 607acd4f

DanielHesslow authored Jun 03, 2022



* Add gated-silu to t5 architecture to support UL2

* Fix error message

* formatting

* formatting again

* refactor

* fix classnames in _init_weights

* remove is_gated

* add test

* fix test

* Try without the test?

* Add back the test.

* Improve error message.
Co-authored-by: Daniel Hesslow <daniel@lighton.ai>

607acd4f

02 Jun, 2022 3 commits
- Implemented loss for training AudioFrameClassification (#17513) · 046c5ea9
  Moreno La Quatra authored Jun 02, 2022
```
* Implemented loss for training AudioFrameClassification

* reported changes in wav2vec2 main class and used make copies to propagate

* running black for code formatting
```
  046c5ea9
- Update configuration_auto.py (#17527) · 085321c9
  Kamal Raj authored Jun 02, 2022
  
  085321c9
- Check list of models in the main README and sort it (#17517) · 048dd73b
  Sylvain Gugger authored Jun 02, 2022
```
* Script for README

* Fix copies

* Complete error message
```
  048dd73b
01 Jun, 2022 7 commits

Adding LeViT Model by Facebook (#17466) · 84aaadd8

Anugunj Naman authored Jun 01, 2022



* levit files

* levit tests

* weights script

* weights script

* update

* style fixes

* few minor corrections

* Added teacher model

* edit docs

* fix-copies

* style fixes

* pr error resolved

* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* suggested pr changes

* style fixes

* minor bug

* update

* minor doc edit

* style

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/levit/test_modeling_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* residual layer readable

* style

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update tests/models/levit/test_feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* change checkpoints and style

* update

* minor changes

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

84aaadd8

Debug LukeForMaskedLM (#17499) · 4d1ce396

Ryokan RI authored Jun 01, 2022

* add a test for a word only input

* make LukeForMaskedLM work without entity inputs

* update test

* add LukeForMaskedLM to MODEL_FOR_MASKED_LM_MAPPING_NAMES

* restore pyproject.toml

* empty line at the end of pyproject.toml

4d1ce396

Refactor classes to inherit from nn.Module instead of nn.Sequential (#17493) · bdc01711
amyeroberts authored Jun 01, 2022
```
* Adapt Maskformer, VAN, ResNet and RegNet modules to inherit from nn.Module
```
bdc01711
Fix wav2vec2 export onnx model with attention_mask error (#16004) · b1160c0b
nilboy authored Jun 01, 2022
```
* Fix wav2vec2 export onnx model with attention_mask error

* fix repository_consistency
```
b1160c0b

Add warning when using older version of torch for ViltFeatureExtractor (#16756) · d91da4c6

Xing Han Lu authored Jun 01, 2022



* Update feature_extraction_vilt.py

* apply black

* Update imports

* Change warning to logging

* Use logger instead of logging.logging

* make fixup

* Move error message

* Update src/transformers/models/vilt/feature_extraction_vilt.py
Co-authored-by: Xing Han Lu <xhlperso@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

d91da4c6

Fix typo of variable names for key and query projection layer (#17155) · 24092b14

Kyeongpil Kang authored Jun 01, 2022

self.pos_proj and self.pos_q_proj should be changed to self.pos_key_proj and self.pos_query_proj as same as PyTorch implements.

24092b14

Add OnnxConfig for SqueezeBert iss17314 (#17315) · 4f38808e

Ruihua Fang authored Jun 01, 2022



* add onnx config for SqueezeBert

* add test for onnx config for SqueezeBert

* add automatically updated doc for onnx config for SqueezeBert

* Update src/transformers/onnx/features.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/models/squeezebert/configuration_squeezebert.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

4f38808e

31 May, 2022 8 commits

[GPT2Tokenizer] Fix GPT2 with bos token (#17498) · ba286fe7
Patrick von Platen authored May 31, 2022

ba286fe7

Opt in flax and tf (#17388) · 7822a9b7

Arthur authored May 31, 2022



* initial commit

* add init file

* update globakl init

* update index and dummy objects

* style

* update modelling auto

* fix initi typo in src/transformers

* fix typo in modeling tf auto, opt was in wrong mapping name

* fixed a slow test : saved_model

* style

* fix positionnal embedding if no position id is provided

* update tf test

* update test flax requirements

* fixed serialization

* update

* update tf name to allow smooth convertion

* update flax tests

* style

* fix test typo

* fix tf typo test

* add xla for generate support in causal LM

* fixed bug

* cleaned tf tests

* style

* removed from PT for slow tests

* fix typp

* opt test as slow

* trying to fix GPT2 undefined

* correct documentation and add to test doc

* update tf doc

* fix doc

* fake commit

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update test based on review

* merged main layer for functionning test

* fixup + quality

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update long comment

* make fix copies
Co-authored-by: Arthur <arthur@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7822a9b7

[Json configs] Make json prettier for all saved tokenizer files & ensure same... · f394a2a5

Patrick von Platen authored May 31, 2022

[Json configs] Make json prettier for all saved tokenizer files & ensure same json format for all processors (tok + feat_extract) (#17457)

* [Json dump] Make json prettier

* correct more tokenizeirs

* more patterns

* add aggressive test

* the aggressive test was actually useful :-)

* more tests

* Apply suggestions from code review

f394a2a5

Fix checkpoint name (#17484) · 8f8b3cbc
Yih-Dar authored May 31, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
8f8b3cbc

Added XLM onnx config (#17030) · 5af38953

Ritik Nandwal authored May 31, 2022

* Add onnx configuration for xlm

* Add supported features for xlm

* Add xlm to models exportable with onnx

* Add xlm architecture to test file

* Modify docs

* Make code quality fixes

5af38953

TF: GPT-2 generation supports left-padding (#17426) · 975dd2bb

Joao Gante authored May 31, 2022

* TF GPT-2 now properly works with left padding

* throw a warning when eos token == pad token and there is no attention mask

975dd2bb

Fx support for multiple model architectures (#17393) · 28d00482

Michael Benayoun authored May 31, 2022

* Support for Bart and LayoutLM, and partial support for XLNet

* Support for mbart

* A lot of new models supported

* Support for other models

* LayoutLM fix

* Use strings instead of classes

28d00482

typo IBERT in __repr__ quant_mode (#17398) · 04681c1d
Ivan Gonzalez authored May 31, 2022
```
fix #17397
```
04681c1d

26 May, 2022 1 commit
- [OPT] Fix bos token id default (#17441) · 7999ec12
  Patrick von Platen authored May 26, 2022
  
  7999ec12
25 May, 2022 1 commit
- Upd AutoTokenizer.from_pretrained doc examples (#17416) · 35e2d13f
  Cookie_thief authored May 25, 2022
  
  35e2d13f