Commits · 4cf38148dc98b3df1df6eb2f06e4f02448026b19 · chenpangpang / transformers

21 Nov, 2022 5 commits

Generate: `model_kwargs` can also be an input to `prepare_inputs_for_generation` (#20353) · 4cf38148
Joao Gante authored Nov 21, 2022

4cf38148

add MobileNetV1 model (#17799) · d21c97cc

Matthijs Hollemans authored Nov 21, 2022

* add model files etc for MobileNetV2

* rename files for MobileNetV1

* initial implementation of MobileNetV1

* fix conversion script

* cleanup

* write docs

* tweaks

* fix conversion script

* extract hidden states

* fix test cases

* make fixup

* fixup it all

* remove main from doc link

* fixes

* fix tests

* fix up

* use google org

* fix weird assert

* fixup

* use google organization for checkpoints

d21c97cc

[Switch Transformers] Fix failing slow test (#20346) · 74297d0a
Younes Belkada authored Nov 21, 2022

* run slow test on GPU

* remove unnecessary device assignment

* use `torch_device` instead

74297d0a

Generate: add generation config class (#20218) · 3de07473

Joao Gante authored Nov 21, 2022


Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3de07473

Fix torch device issues (#20304) · 8503cc75

Yih-Dar authored Nov 21, 2022



* fix device issue
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

8503cc75

18 Nov, 2022 4 commits

Add Neighborhood Attention Transformer (NAT) and Dilated NAT (DiNAT) models (#20219) · fc4a993e

Ali Hassani authored Nov 18, 2022

* Add DiNAT

* Adds DiNAT + tests

* Minor fixes

* Added HF model

* Add natten to dependencies.

* Cleanup

* Minor fixup

* Reformat

* Optional NATTEN import.

* Reformat & add doc to _toctree

* Reformat (finally)

* Dummy objects for DiNAT

* Add NAT + minor changes

Adds NAT as its own independent model + docs, tests
Adds NATTEN to ext deps to ensure ci picks it up.

* Remove natten from `all` and `dev-torch` deps, add manual pip install to ci tests

* Minor fixes.

* Fix READMEs.

* Requested changes to docs + minor fixes.

* Requested changes.

* Add NAT/DiNAT tests to layoutlm_job

* Correction to Dinat doc.

* Requested changes.

fc4a993e

[Proposal] Breaking change `zero-shot-object-detection` for improved consistency. (#20280) · 8e777b3b

Nicolas Patry authored Nov 18, 2022

* [Proposal] Breaking change `zero-shot-object-detection` for improved
consistency.

This is a proposal to modify the output of `zero-shot-object-detection`
to provide better alignment with other pipelines.

The output is now strictly the same as `object-detection` whereas before
it would output lists of lists.

The name `candidate_labels` is used throughout for consistency with
other `zero-shot` pipelines.

The pipeline is changed to `ChunkPipeline` to support batching cleanly.

This removes all the list and list-of-lists shenanigans; handling them is
now the job of the base pipeline, not of this specific one.

**Breaking change**: This removes support for complex calls such as
`pipe(images=[image1, image2], text_queries=[candidates1, candidates2])`. Only
`pipe([{"image": image1, "candidate_labels": candidates1}, {"image": image2, "candidate_labels": candidates2}])`
is supported when dealing with lists and/or datasets.
We could keep them, but that would add a lot of complexity to the code
base. Since the pipeline is rather young, I'd rather break to keep the
code simpler, but we can revert this.

**Breaking change**: The name of the argument is now `image` instead of
`images`, since it expects only one image by default. This is revertible,
like the previous one.

**Breaking change**: The output type is now simplified and flattened:

`pipe(inputs) == [{**object1}, {**object2}]`
instead of the previous
`pipe(inputs) == [[{**object1}, {**object1}], [{**object2}]]`
where the different instances were grouped by candidate label
within lists.
IMHO this is not really desirable, since it would output empty lists and
only adds superfluous indirection compared to
`zero-shot-object-detection`.

In terms of results it is relatively change-free. It does change the
computation, however, since batching is now handled by the pipeline
itself. It **did** change the results for the small models, so there
seems to be a real difference in how the models handle this.

* Fixing the doctests.

* Behind is_torch_available.

8e777b3b
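
To make the shape change described in the `zero-shot-object-detection` commit above concrete, here is a minimal sketch in plain Python. It does not call the pipeline. The helper names `to_new_inputs` and `flatten_old_output` and the sample dictionaries are hypothetical; only the `image` and `candidate_labels` keys come from the commit message.

```python
from typing import Any, Dict, List


def to_new_inputs(images: List[Any], text_queries: List[List[str]]) -> List[Dict[str, Any]]:
    """Turn old-style parallel lists into the one-dict-per-image form."""
    if len(images) != len(text_queries):
        raise ValueError("images and text_queries must have the same length")
    return [
        {"image": image, "candidate_labels": labels}
        for image, labels in zip(images, text_queries)
    ]


def flatten_old_output(nested: List[List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Flatten an old-style list-of-lists result into one flat list of detections."""
    return [detection for group in nested for detection in group]


# Hypothetical data, for illustration only (strings stand in for images).
inputs = to_new_inputs(["image1", "image2"], [["cat", "dog"], ["car"]])
old_output = [[{"label": "cat", "score": 0.9}], [], [{"label": "car", "score": 0.8}]]
flat_output = flatten_old_output(old_output)
```

Note how the empty inner list in `old_output` simply disappears after flattening, which is the indirection the commit message argues against.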

Add AnyPrecisionAdamW optimizer (#18961) · 84c9cc6d

atturaioe authored Nov 18, 2022

* Add AnyPrecisionAdamW optimizer

* Add optim_args argument to TrainingArgs

* Add tests for AnyPrecisionOptimizer

* Change AnyPrecisionAdam default params to float32

* Move default_anyprecision_kwargs in trainer test

* Rename AnyPrecisionAdamW

84c9cc6d

Add padding image transformation (#19838) · b9826942

amyeroberts authored Nov 18, 2022

* Add padding transformation

* Add in upstream changes

* Update tests & docs

* Code formatting tuples in docstring

b9826942

17 Nov, 2022 5 commits

refactor test (#20300) · 4bb07647
Younes Belkada authored Nov 17, 2022

- simplifies the device checking test

4bb07647

Add AutoBackbone + ResNetBackbone (#20229) · 6b217c52

NielsRogge authored Nov 17, 2022



* Add ResNetBackbone

* Define channels and strides as property

* Remove file

* Add test for backbone

* Update BackboneOutput class

* Remove strides property

* Fix docstring

* Add backbones to SHOULD_HAVE_THEIR_OWN_PAGE

* Fix auto mapping name

* Add sanity check for out_features

* Set stage names based on depths

* Update to tuple
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

6b217c52

Generate: general TF XLA contrastive search are now slow tests (#20277) · 0f78529f
Joao Gante authored Nov 17, 2022

* move contrastive search test to slow

0f78529f

TF: add test for `PushToHubCallback` (#20231) · 2062c285
Joao Gante authored Nov 17, 2022

* test hub tf callback

* create repo before cloning it

2062c285

[bnb] Let's warn users when saving 8-bit models (#20282) · 7d65efec

Younes Belkada authored Nov 17, 2022



* add warning on 8-bit models

- added tests
- added wrapper

* move to a private attribute

- remove wrapper
- changed `save_pretrained` method

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7d65efec

16 Nov, 2022 2 commits

Data collator for token classification pads labels column when it receives PyTorch tensors (#20244) · 610acc5a

Alexander Markov authored Nov 16, 2022



* token cls data_collator pads labels column

* remove walrus operator for code quality

* remove redundant space

* remove comment that was fixed

* PR comments fix
Co-authored-by: Alexander Markov <amarkov.me@gmail.com>

610acc5a
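
The label padding named in the commit title above can be sketched in plain Python. This is an illustration of the idea, not the collator's code: the function name `pad_labels` and its parameters are hypothetical, and `-100` is assumed here as the conventional "ignore" label value.

```python
from typing import List, Sequence


def pad_labels(
    labels: Sequence[Sequence[int]],
    pad_value: int = -100,  # assumed "ignore" value, not taken from the commit
    padding_side: str = "right",
) -> List[List[int]]:
    """Pad every label sequence to the length of the longest one."""
    max_len = max((len(seq) for seq in labels), default=0)
    padded = []
    for seq in labels:
        fill = [pad_value] * (max_len - len(seq))
        row = list(seq)  # copy so the caller's data is not mutated
        padded.append(row + fill if padding_side == "right" else fill + row)
    return padded


batch = pad_labels([[1, 0, 2], [3]])
```

After padding, all rows have equal length, so the batch could be stacked into a single rectangular tensor.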

Adds image-guided object detection support to OWL-ViT (#20136) · a00b7e85

Alara Dirik authored Nov 16, 2022

Adds an image-guided object detection method to the `OwlViTForObjectDetection` class, as described in the original paper. One-shot (image-guided) object detection lets users search the input image for objects similar to those in a query image.

Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>

a00b7e85

15 Nov, 2022 10 commits

Slightly alter Keras dummy loss (#20232) · 26ec7928

Matt authored Nov 15, 2022

* Slightly alter Keras dummy loss

* Slightly alter Keras dummy loss

* Add sample weight to test_keras_fit

* Fix test_keras_fit for datasets

* Skip the sample_weight stuff for models where the model tester has no batch_size

26ec7928

[CLIP] allow loading projection layer in vision and text model (#18962) · 7f744338

Suraj Patil authored Nov 15, 2022



* allow loading projection in text and vision model

* begin tests

* finish test for CLIPTextModelTest

* style

* add slow tests

* add new classes for projection heads

* remove with_projection

* add in init

* add in doc

* fix tests

* fix some more tests

* fix copies

* fix docs

* remove leftover from fix-copies

* add the head models in IGNORE_NON_AUTO_CONFIGURED

* fix docstr

* fix tests

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstr for models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

7f744338

Enable PyTorch 1.13 (#20168) · 9643ecf8

Sylvain Gugger authored Nov 15, 2022

* Try PT1.13 by removing torch scatter

* Skip failing tests

* Style

* Remove testing extras for repo utils

* Try with all decorators

* Try to wipe the cache

* Fix all tests?

* Try this way

* Fix comma

* Update to main

* Try with less deps

* Quality

9643ecf8

Fix MaskformerFeatureExtractor (#20100) · b4997382

NielsRogge authored Nov 15, 2022



* Fix bug

* Add another fix

* Add print statement

* Apply fix

* Fix feature extractor

* Fix feature extractor

* Add print statements

* Add print statements

* Remove print statements

* Add instance segmentation integration test

* Add integration test for semantic segmentation

* Add draft for panoptic segmentation integration test

* Fix integration test for panoptic segmentation

* Remove slow annotator
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

b4997382

Add object detection + segmentation transforms (#20003) · 4c7e8d09

amyeroberts authored Nov 15, 2022



* Add transforms for object detection

* Update src/transformers/image_transforms.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Better var names & docstring

* Remove unused var desc in docstring

* Update src/transformers/image_transforms.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

4c7e8d09

Add Switch transformers (#19323) · 163ac3d3

Younes Belkada authored Nov 15, 2022



* first commit

* add more comments

* add router v1

* clean up

- remove `tf` modeling files

* clean up

- remove `tf` modeling files

* clean up

* v0 routers

* added more router

- Implemented `ExpertsChooseMaskedRouter`

- added tests
- 2 more routers to implement

* last router

* improved docstring

- completed the docstring in `router.py`
- added more args in the config

* v0 sparse mlp

* replace wrong naming

* forward pass run

* update MOE layer

* small router update

* fixup

* consistency

* remove scatter router

* remove abstract layer

* update test and model for integration testing

* v1 conversion

* update

* hardcode hack

* all keys match

* add gin conversion, without additional libraries

* update conversion script

* delete router file

* update tests wrt router deletion

* fix router issues

* update expert code

* update, logits match, code needs REFACTORING

* Refactor code
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* add generate tests
Co-authored-by: younesbelkada <younesbelkada@gmail.com>

* add support for router loss
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix forward error

* refactor a bit

* remove `FlaxSwitchTransformers` modules

* more tests pass

* Update code
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fixup

* fix tests

* fix doc

* fix doc + tokenization

* fix tokenizer test

* fix test

* fix loss output

* update code for backward pass

* add loss support

* update documentation

* fix documentation, clean tokenizer

* more doc fix, cleanup example_switch

* fix failing test

* fix test

* fix test

* fix loss issue

* move layer

* update doc and fix router capacity usage

* fixup

* add sparse mlp index for documentation on hub

* fixup

* test sparse mix architecture

* Apply suggestions from code review

* Update docs/source/en/model_doc/switch_transformers.mdx

* fixup on update

* fix tests

* fix another test

* attempt fix

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* try

* all tests pass

* fix jitter noise

* Apply suggestions from code review

* doc tests pass

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove assert

* change config order

* fix readme japanese

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove parallelizable tests + add one liners

* remove ONNX config

* fix nits

- add `T5Tokenizer` in auto mapping
- remove `Switch Transformers` from ONNX supported models

* remove `_get_router`

* remove asserts

* add check in test for `router_dtype`

* add `SwitchTransformersConfig` in `run_pipeline_test`

* Update tests/pipelines/test_pipelines_summarization.py

* add huge model conversion script

* fix slow tests

- add better casting for `Linear8bitLt`
- remove `torchscript` tests

* add make dir

* style on new script

* fix nits

- doctest
- remove `_keys_to_ignore_on_load_unexpected`

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py

* add google as authors

* fix year

* remove last `assert` statements

* standardize vertical spaces

* fix failing import

* fix another failing test

* Remove strange `authorized_keys`

* removing todo and padding that is never used
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: ybelkada <younes@huggingface.co>
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur@huggingface.co>

163ac3d3

Add `accelerate` support for `ViT` family (#20174) · f1e8c48c

Younes Belkada authored Nov 15, 2022

* add `accelerate` support for `ViT` family

- add `_no_split_modules`
- manually cast to the right `dtype`: to change

* enable `float16` for `deit`

* fix `make fixup`

* add `slow` test for `fp16` inference

* another safety check

* Update src/transformers/models/deit/modeling_deit.py

f1e8c48c

[WHISPER] Update modeling tests (#20162) · 11b2e45c

Arthur authored Nov 15, 2022



* Update modeling tests

* update tokenization test

* typo

* nit

* fix expected attention outputs

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update tests from review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* remove problematic kwargs passed to the padding function
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

11b2e45c

update relative positional embedding (#20203) · f60eec40

Arthur authored Nov 15, 2022

* update relative positional embedding

* make fix copies

* add `use_cache` to list of arguments

* fixup

* 1line function

* add `test_decoder_model_past_with_large_inputs_relative_pos_emb`

* add relative pos embedding test for more models

* style

f60eec40

Make `ImageSegmentationPipelineTests` less flaky (#20147) · f9909fbf

Yih-Dar authored Nov 15, 2022



* Fix ImageSegmentationPipelineTests

* Use 0.9

* no zip

* links to show images

* links to show images

* rebase
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f9909fbf

14 Nov, 2022 6 commits

Adding chunking for whisper (all seq2seq actually). Very crude matching algorithm. (#20104) · 25c451e5

Nicolas Patry authored Nov 14, 2022

* Very crude matching algorithm.

* Fixing tests.

* Removing comments

* Adding warning + fix short matches.

* Cleanup tests.

* Quality.

* Less noisy.

* Fixup.

25c451e5

Generate: add Bloom fixes for contrastive search (#20213) · 938cb047
Joao Gante authored Nov 14, 2022

938cb047

mark `test_save_load_fast_init_from_base` as `is_flaky` (#20200) · 536e60d2
Yih-Dar authored Nov 14, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

536e60d2

[ROC_BERT] Make CI happy (#20175) · 8dcf494e
Younes Belkada authored Nov 14, 2022

* fix slow test

* Update tests/models/roc_bert/test_modeling_roc_bert.py

8dcf494e

Fix tapas scatter (#20149) · 78a471ff

Bartosz Szmelczynski authored Nov 14, 2022



* First draft

* Remove scatter dependency

* Add require_torch

* update vectorized sum test, add clone call

* remove artifacts

* fix style

* fix style v2

* remove "scatter" mentions from the code base

* fix isort error
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

78a471ff

add MobileNetV2 model (#17845) · f711d683

Matthijs Hollemans authored Nov 14, 2022

* add model files etc for MobileNetV2

* rename files for MobileNetV1

* initial implementation of MobileNetV1

* fix conversion script

* cleanup

* write docs

* tweaks

* fix conversion script

* extract hidden states

* fix test cases

* make fixup

* fixup it all

* rename V1 to V2

* fix checkpoints

* fixup

* implement first block + weight conversion

* add remaining layers

* add output stride and dilation

* fixup

* add tests

* add deeplabv3+ head

* a bit of fixup

* finish deeplab conversion

* add link to doc

* fix issue with JIT trace

in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value.

* cleanup

* fix order of models

* fix rebase error

* remove main from doc link

* add image processor

* remove old feature extractor

* fix converter + other issues

* fixup

* fix unit test

* add to onnx tests (but these appear broken now)

* add post_process_semantic_segmentation

* use google org

* remove unused imports

* move args

* replace weird assert

f711d683
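
The JIT-trace note in the MobileNetV2 commit above concerns a padding calculation that uses a remainder. Below is a minimal sketch of a TensorFlow-style "same" padding computation on plain Python ints. The function name and signature are hypothetical, and this is not the model's actual code. Because every operand is an `int`, the result is an ordinary constant rather than an operation on traced tensors.

```python
from typing import Tuple


def same_padding(in_height: int, in_width: int, kernel_size: int, stride: int) -> Tuple[int, int, int, int]:
    """Return (left, right, top, bottom) padding for TensorFlow-style "same" padding."""
    # Coerce to plain ints so the remainder below is evaluated eagerly.
    in_height, in_width = int(in_height), int(in_width)

    def total_pad(size: int) -> int:
        remainder = size % stride
        if remainder == 0:
            return max(kernel_size - stride, 0)
        return max(kernel_size - remainder, 0)

    pad_h = total_pad(in_height)
    pad_w = total_pad(in_width)
    # Any odd pixel of padding goes to the bottom/right, as TensorFlow does.
    pad_top, pad_left = pad_h // 2, pad_w // 2
    return pad_left, pad_w - pad_left, pad_top, pad_h - pad_top


padding = same_padding(in_height=224, in_width=225, kernel_size=3, stride=2)
```

Here the even height needs one pixel of padding in total and the odd width needs two, which shows why the remainder matters.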

11 Nov, 2022 2 commits

Fix type - update any PIL.Image.Resampling (#20172) · 6cc06d17
amyeroberts authored Nov 11, 2022

6cc06d17

[OWL-ViT] Make model consistent with CLIP (#20144) · cbbeca3d

NielsRogge authored Nov 11, 2022



* Apply fix

* Fix test

* Remove another argument which is not used

* Fix pipeline test

* Add argument back, add deprecation warning

* Add warning at other location

* Use warnings instead

* Add num_channels to config
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>

cbbeca3d

10 Nov, 2022 4 commits

Add Jukebox model (replaces #16875) (#17826) · 61a51f5f
Arthur authored Nov 10, 2022

61a51f5f

Skip broken test · 9740a03f
Sylvain Gugger authored Nov 10, 2022

9740a03f

[processor] Add 'model input names' property (#20117) · 905e5773

Sanchit Gandhi authored Nov 10, 2022

* [processor] Add 'model input names' property

* add test

* no f string

* add generic property method to mixin

* copy to multimodal

* copy to vision

* tests for all audio

* remove ad-hoc tests

* style

* fix flava test

* fix test

* fix processor code

905e5773

Adding support for LayoutLMvX variants for `object-detection`. (#20143) · d066c373

Nicolas Patry authored Nov 10, 2022

* Adding support for LayoutLMvX variants for `object-detection`.

* Revert bogus `layoutlm` feature extractor, which does not exist (it was a
V2 model).

* Updated condition.

* Handling the comments.

d066c373

09 Nov, 2022 2 commits

Update VisionEncoderDecoder to use an image processor (#20137) · f3d99e49

amyeroberts authored Nov 09, 2022

* TrOCR processor uses an image processor

* Update VisionEncoderDecoder

* Add feature_extractor_class property

f3d99e49

Generate: move generation_*.py src files into generation/*.py (#20096) · f270b960

Joao Gante authored Nov 09, 2022

* move generation_*.py src files into generation/*.py

* populate generation.__init__ with lazy loading

* move imports and references from generation.xxx.object to generation.object

f270b960
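
The "lazy loading" mentioned in the commit above can be illustrated with a generic sketch: a module object that imports another module only when one of its exported names is first accessed. This is not the library's implementation. The class `LazyModule` and the import structure used below (standard-library modules standing in for `generation` submodules) are assumptions made for illustration.

```python
import importlib
from types import ModuleType
from typing import Dict, List


class LazyModule(ModuleType):
    """Module whose attributes are imported from other modules on first access."""

    def __init__(self, name: str, import_structure: Dict[str, List[str]]):
        super().__init__(name)
        # Map each exported name to the module that defines it.
        self._name_to_module = {
            attr: module for module, attrs in import_structure.items() for attr in attrs
        }

    def __getattr__(self, attr: str):
        # Only called when normal attribute lookup fails.
        if attr.startswith("_") or attr not in self._name_to_module:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        module = importlib.import_module(self._name_to_module[attr])
        value = getattr(module, attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


# Hypothetical structure: standard-library modules stand in for real submodules.
lazy = LazyModule("lazy_demo", {"json": ["dumps"], "fractions": ["Fraction"]})
encoded = lazy.dumps({"a": 1})
```

Only the name that was actually used (`dumps`) has been resolved at this point; `Fraction` stays unresolved until it is first accessed.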