Commits · 6936e7c4875cab371d4d13e5c8e27a6b53276f0e · chenpangpang / transformers

10 Aug, 2022 2 commits

Update philosophy to include other preprocessing classes (#18550) · 6936e7c4
Steven Liu authored Aug 10, 2022
```
* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedback
```
6936e7c4

`bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a

Younes Belkada authored Aug 10, 2022



* first commit

* correct replace function

* add final changes

- works like a charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* remove the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

- add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* make style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more docstring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* modify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that tests pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n` by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

4a51075a
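
The commit bullets above mention a `replace_8bit_linear` function, a threshold parameter, and keeping the lm head in full precision. A minimal sketch of that replacement idea follows, assuming a toy module tree. `Linear`, `Linear8bit`, `Module`, and `replace_linear_with_8bit` are hypothetical stand-ins, not the classes or function from the PR, and the default threshold of 6.0 is an assumption.

```python
# Toy stand-ins for illustration only; these are NOT torch or bitsandbytes classes.
class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class Linear8bit:
    def __init__(self, in_features, out_features, threshold):
        self.in_features = in_features
        self.out_features = out_features
        self.threshold = threshold  # outlier threshold (assumed meaning)


class Module:
    def __init__(self, **children):
        self.children = dict(children)


def replace_linear_with_8bit(module, threshold=6.0, skip_names=("lm_head",)):
    """Recursively swap Linear children for Linear8bit, leaving skip_names untouched."""
    for name, child in list(module.children.items()):
        if isinstance(child, Module):
            # Descend into nested containers.
            replace_linear_with_8bit(child, threshold, skip_names)
        elif isinstance(child, Linear) and name not in skip_names:
            module.children[name] = Linear8bit(
                child.in_features, child.out_features, threshold
            )
    return module


# A hypothetical two-level model: one block plus an lm head.
model = Module(
    block=Module(query=Linear(64, 64), dense=Linear(64, 256)),
    lm_head=Linear(64, 1000),
)
replace_linear_with_8bit(model)
```

Skipping by attribute name mirrors the "keep the lm head in full precision" bullet. How the real integration decides which modules to skip is not stated in this log.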

09 Aug, 2022 3 commits
- 📝 update documentation build section (#18548) · 8cf4a6f0
  Steven Liu authored Aug 09, 2022
  
  8cf4a6f0
- 📝 update metric with evaluate (#18535) · 0c183cc2
  Steven Liu authored Aug 09, 2022
  
  0c183cc2
- Add mt5 onnx config (#18394) · 8cb5ecd9
  Thomas Chaigneau authored Aug 09, 2022
```
* update features

* MT5OnnxConfig added, with updated tests and docs

* fix imports

* fix onnx_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
```
  8cb5ecd9
08 Aug, 2022 6 commits
- Spanish translation of summarization.mdx (#15947) (#18477) · 499450ed
  AguilaCudicio authored Aug 08, 2022
```
* Add Spanish translation of summarization.mdx

* Apply suggestions from code review
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
```
  499450ed
- Add Spanish translation of converting_tensorflow_models.mdx (#18512) · ed70f242
  Ian Castillo authored Aug 08, 2022
```
* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish wording

* Add suggested changes from review
```
  ed70f242
- Update perf_train_gpu_one.mdx (#18532) · f1f5de31
  Mishig Davaadorj authored Aug 08, 2022
  
  f1f5de31
- Add example of multimodal usage to pipeline tutorial (#18498) · 3632531e
  Steven Liu authored Aug 08, 2022
```
* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedback

* 🖍 apply niels feedback
```
  3632531e
- ✨ update to use interlibrary links instead of Markdown (#18500) · 36b37990
  Steven Liu authored Aug 08, 2022
  
  36b37990
- update fsdp docs (#18521) · 2fecde74
  Sourab Mangrulkar authored Aug 08, 2022
```
* updating fsdp documentation

* typo fix
```
  2fecde74
06 Aug, 2022 1 commit

Just re-reading the whole doc every couple of months 😬 (#18489) · 8d1f9039

Julien Chaumond authored Aug 06, 2022

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

8d1f9039

05 Aug, 2022 2 commits
- Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) · 9d64f7f0
  Yih-Dar authored Aug 05, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  9d64f7f0
- Move cache folder to huggingface/hub for consistency with hf_hub (#18492) · faacdf00
  Sylvain Gugger authored Aug 05, 2022
```
* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one
```
  faacdf00
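
The cache move above can be sketched as a path resolution, assuming the environment variable names `HF_HOME` and `XDG_CACHE_HOME` and the `~/.cache/huggingface/hub` default. `default_hub_cache` is a hypothetical helper, not a function from the library.

```python
import os


def default_hub_cache(env=None):
    """Resolve a hub cache directory from an environment mapping (assumed variable names)."""
    env = os.environ if env is None else env
    # Fall back to ~/.cache when XDG_CACHE_HOME is unset or empty.
    cache_root = env.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    # HF_HOME, when set, overrides the huggingface directory under the cache root.
    hf_home = env.get("HF_HOME") or os.path.join(cache_root, "huggingface")
    return os.path.join(hf_home, "hub")


print(default_hub_cache({"HF_HOME": "/data/hf"}))
```

Passing the environment as a mapping keeps the sketch easy to exercise without touching the real process environment.
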
04 Aug, 2022 1 commit

Add VideoMAE (#17821) · f9a0008d

NielsRogge authored Aug 04, 2022



* First draft

* Add VideoMAEForVideoClassification

* Improve conversion script

* Add VideoMAEForPreTraining

* Add VideoMAEFeatureExtractor

* Improve VideoMAEFeatureExtractor

* Improve docs

* Add first draft of model tests

* Improve VideoMAEForPreTraining

* Fix base_model_prefix

* Make model take pixel_values of shape (B, T, C, H, W)

* Add loss computation of VideoMAEForPreTraining

* Improve tests

* Improve model tests

* Make all tests pass

* Add VideoMAE to main README

* Add tests for VideoMAEFeatureExtractor

* Add integration test

* Improve conversion script

* Rename patch embedding class

* Remove VideoMAELayer from init

* Update design of patch embeddings

* Improve comments

* Improve conversion script

* Improve conversion script

* Add conversion of pretrained model

* Add loss verification of pretrained model

* Add loss verification of unnormalized targets

* Add integration test for pretraining model

* Apply suggestions from code review

* Fix bug to make feature extractor resize only shorter edge

* Address more comments

* Improve normalization of videos

* Add doc examples

* Move constants to dedicated script

* Remove scripts

* Transfer checkpoints, fix docs

* Update script

* Update image mean and std

* Fix doc tests

* Set return_tensors to NumPy by default

* Revert the previous change
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

f9a0008d
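
One bullet above fixes the feature extractor so that it resizes only the shorter edge. A minimal sketch of that size computation follows, assuming the aspect ratio is preserved and the longer edge is rounded to the nearest pixel. `shorter_edge_size` is a hypothetical helper, not the code from `VideoMAEFeatureExtractor`.

```python
def shorter_edge_size(height, width, target):
    """Return (new_height, new_width) with the shorter edge scaled to target."""
    if height <= width:
        # Height is the shorter edge: pin it to target and scale the width.
        return target, round(width * target / height)
    # Width is the shorter edge: pin it to target and scale the height.
    return round(height * target / width), target


# Each frame of a (B, T, C, H, W) video would get the same output size.
print(shorter_edge_size(240, 320, 224))
```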

03 Aug, 2022 2 commits

Add Spanish translation of run_scripts.mdx (#18415) · 10e1ec9a

Ian Castillo authored Aug 03, 2022

* Add file in spanish docs to be translated

* Translate first two sections to Spanish

* Translate four additional sections to Spanish

* Finish translation to Spanish

* Improve writing style in Spanish

* Add suggested changes from reviewer

10e1ec9a

Update _toctree.yml (#18440) · 92915ebe

Steven Liu authored Aug 03, 2022

This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.

92915ebe

02 Aug, 2022 2 commits
- Add programming languages (#18434) · 5096a654
  Christopher Akiki authored Aug 02, 2022
```
The current wording makes it sound as if the programming languages are part of the 46 natural languages.
```
  5096a654
- update maskformer docs (#18423) · 8ae77842
  Alara Dirik authored Aug 02, 2022
```
* update maskformer docs

* fix typo
```
  8ae77842
01 Aug, 2022 4 commits

Split model list on modality (#18328) · 151a2aaa

Steven Liu authored Aug 01, 2022

* 📝 split up model list

* Adapt script to reorg

* apply niels feedback
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

151a2aaa

Rewrite push_to_hub to use upload_files (#18366) · 01db72ab

Sylvain Gugger authored Aug 01, 2022

* Rewrite push_to_hub to use upload_files

* Adapt the doc a bit

* Address review comments and clean doc

01db72ab

Adding fine-tuning models to LUKE (#18353) · 62098b93

Ikuya Yamada authored Aug 02, 2022

* add LUKE models for downstream tasks

* add new LUKE models to docs

* fix typos

* remove commented lines

* exclude None items from tuple return values

62098b93
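
The last bullet above, excluding `None` items from tuple return values, can be sketched as follows. `to_output_tuple` is a hypothetical helper name, and the strings stand in for tensors.

```python
def to_output_tuple(*values):
    """Build a tuple of outputs, dropping entries that are exactly None."""
    # Compare with `is not None` so falsy values such as 0 are kept.
    return tuple(value for value in values if value is not None)


loss = None              # no labels were passed in this hypothetical call
logits = "logits"        # placeholder for a tensor
hidden_states = None     # not requested
outputs = to_output_tuple(loss, logits, hidden_states)
print(outputs)
```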

Fix docs (#18399) · 7b9e995b

NielsRogge authored Aug 01, 2022


Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

7b9e995b

29 Jul, 2022 2 commits

Replace `as_target` context managers by direct calls (#18325) · 986526a0

Sylvain Gugger authored Jul 29, 2022



* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py
Co-authored-by: amyeroberts <amy@huggingface.co>

* Style
Co-authored-by: amyeroberts <amy@huggingface.co>

986526a0
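
The change above replaces a context manager with a direct call. A toy sketch of the two calling styles follows. `ToyTokenizer` is a hypothetical stand-in and not the library tokenizer; the keyword name `text_target` is assumed here for the direct-call form.

```python
from contextlib import contextmanager


class ToyTokenizer:
    """Hypothetical tokenizer that tags tokens as source or target."""

    def __init__(self):
        self._target_mode = False

    def _encode(self, text, target):
        prefix = "<tgt>" if target else "<src>"
        return [prefix] + text.split()

    @contextmanager
    def as_target_tokenizer(self):
        # Old style: temporarily switch the tokenizer into target mode.
        self._target_mode = True
        try:
            yield self
        finally:
            self._target_mode = False

    def __call__(self, text=None, text_target=None):
        out = {}
        if text is not None:
            out["input_ids"] = self._encode(text, target=self._target_mode)
        if text_target is not None:
            # New style: targets are passed directly and returned as labels.
            out["labels"] = self._encode(text_target, target=True)
        return out


tok = ToyTokenizer()

# Context-manager style.
old_inputs = tok("hello world")
with tok.as_target_tokenizer():
    old_labels = tok("bonjour le monde")["input_ids"]

# Direct-call style.
batch = tok("hello world", text_target="bonjour le monde")
```

In this sketch both styles yield the same labels; the direct call avoids hidden state on the tokenizer.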

[Docs] Fix Speech Encoder Decoder doc sample (#18346) · a4ee463d
Sanchit Gandhi authored Jul 29, 2022
```
* [Docs] Fix Speech Encoder Decoder doc sample

* improve pre-processing comment

* make style
```
a4ee463d

28 Jul, 2022 3 commits
- Updated _toctree.yml (#18337) · 985c7e3a
  Nicola Procopio authored Jul 28, 2022
  
  985c7e3a
- updated translation (#18333) · a8e27957
  Edoardo Federici authored Jul 28, 2022
```
Left the term fine-tuning since there is no correct translation into Italian and the English term is generally used. The same was done with some terms like "learning rate"
```
  a8e27957
- fixed typo (#18331) · 1e380c7d
  Edoardo Federici authored Jul 28, 2022
  
  1e380c7d
27 Jul, 2022 4 commits

Update feature extractor docs (#18324) · 96be1b7f

Steven Liu authored Jul 27, 2022

As pointed out by @NielsRogge, a feature extractor is used to prepare inputs for a model with a single modality rather than multimodal models.

96be1b7f

start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch … (#18229) · 2b81f72b

Wang, Yi authored Jul 27, 2022



* starting from 1.12, torch_ccl is renamed to oneccl_bindings_for_pytorch and must be imported before use
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add doc for perf_train_cpu_many
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update doc
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

2b81f72b
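
The rename described above can be sketched as a version check that picks the module name to import. `ccl_module_name` and `import_ccl` are hypothetical helpers, and the version string format is an assumption.

```python
import importlib


def ccl_module_name(torch_version):
    """Pick the oneCCL binding module name for a given PyTorch version string."""
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    if (major, minor) >= (1, 12):
        return "oneccl_bindings_for_pytorch"
    return "torch_ccl"


def import_ccl(torch_version):
    """Import the binding module; raises ImportError if it is not installed."""
    return importlib.import_module(ccl_module_name(torch_version))


print(ccl_module_name("1.12.0"))
```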

Add swin transformer v2 (#17469) · e87ac9d1

Ritik Nandwal authored Jul 27, 2022



* Add files generated using transformers-cli add-new-model-like command

* Add changes for swinv2 attention and forward method

* Add fixes

* Add modifications for weight conversion and remaining args in swin model

* Add changes for patchmerging

* Add changes for SwinV2selfattention

* Update conversion script

* Add final fixes for the swin_v2 model

* Add changes for conversion script for pretrained window size case

* Add pretrained window size value from config in SwinV2Encoder class

* Make fixup

* Add swinv2 to models_not_in_readme to utils/check_copies.py

* Modify Swinv2v2 to Swin Transformer V2

* Remove copied from, to run make fixup command

* Add updates to swinv2tf from main branch

* Add pretrained_window_size to config, to make tests pass

* Add modified weights from nandwalritik profile for swinv2

* Update model weights from swinv2 from nandwalritik profile

* Add fix for build_pr_documentation CI fix

* Add fixes for weight conversion

* Add change to make input with padding work

* Add fixes for test cases

* Add few changes from swin to swinv2 to pass test cases

* Remove tests for tensorflow as swinv2 for TF is not added yet

* Override test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet

* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.

* Update docs url for swinv2 in README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Undo changes for check_repo

* Update url in readme.md

* Remove overridden function to test pt_tf_model_equivalence

* Remove TF model imports for Swinv2 as it's not implemented in this PR

* Add changes for index.mdx

* Add swinv2 paper link, abstract and contributor details

* Rename cpb_mlp to continous_position_bias_mlp

* Add tips for swinv2 model

* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update import order in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add copyright statements in weights conversion script.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove Swinv2 from models_not_in_readme

* Reformat code

* Remove TF implementation file for swinv2

* Update start docstring.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add changes for docstring

* Update orgname for weights to microsoft

* Remove to_2tuple function

* Add copied from statements wherever applicable

* Add copied from to Swinv2ForMaskedImageModelling class

* Reformat code.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add unittest.skip(with reason.) for test_inputs_embeds test case.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add updates for test_modeling_swinv2.py

* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function

* Add continuous_position_bias_mlp parameter to conversion script

* Add test for testing masked_image_modelling for swinv2

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add suggested changes

* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder

* Add push_to_hub flag to weight conversion script

* Change order of Swinv2DropPath class

* Add id2label mapping for imagenet 21k

* Add updated url for SwinV2 functions and classes used in implementation

* Update input_feature dimensions format, mentioned in comments.
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Add suggested changes for modeling_swin2.py

* Update docs

* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.

* Fix indentation.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes for making Nit objects in code style

* Add suggested changes

* Add suggested changes for test_modelling_swinv2

* make fix-copies

* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e87ac9d1

[EncoderDecoder] Improve docs (#18271) · ccd4180f

NielsRogge authored Jul 27, 2022



* Improve docs

* Improve docs of speech one as well

* Apply suggestions from code review
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

ccd4180f

26 Jul, 2022 7 commits

Add Spanish translation of custom_models.mdx (#17807) · a5d50483

Ian Castillo authored Jul 26, 2022

* Update index

* Translate to Spanish two sections from custom_models

* Translate to Spanish custom models documentation

* Fixing typos and grammatical errors

* Add requested changes from reviewer

a5d50483

Add Italian translation of sharing_custom_models.mdx (#17631) · 7ea7eba3

Federico Panero authored Jul 26, 2022



* work in progress: custom_models

* Update custom_models.mdx

* Update custom_models.mdx

* Update _toctree.yml

* Update _toctree.yml

* Update custom_models.mdx

* Update custom_models.mdx

* Update _toctree.yml

* Update _toctree.yml
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7ea7eba3

Add Italian translation of converting_tensorflow_models.mdx (#18283) · bbc28106

Federico Panero authored Jul 26, 2022



* Add Italian translation of converting_tensorflow_models.mdx

* Update _toctree.yml

* Update converting_tensorflow_models.mdx

* Update docs/source/it/_toctree.yml
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

bbc28106

[ create_a_model.mdx ] translate to pt (#18098) · 5e0ffd91

Fellip Silva Alves authored Jul 26, 2022



* [ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial

* Delete docs/source/pt-br directory

* [ fast_tokenizers.mdx ] - Continuing work on file

* [ fast_tokenizers.mdx ] - Continuing work on file

* Add fast tokenizers to _toctree.yml

* Eliminated config and toctree.yml

* Nits in fast_tokenizers.mdx

* Finishing create_a_model

* [ create_a_model.mdx ] finishing create a model in pt-br

* [ Changing _toctree.yml ] adding create a model in pt
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

5e0ffd91

Update translation.mdx (#18169) · f58b9c05
Gorkem Ozkaya authored Jul 26, 2022
```
* Update translation.mdx

* update translation.mdx by running make style
```
f58b9c05

Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924) · 2b096508

gilad19 authored Jul 26, 2022



* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* provide classifier only text hidden states

* add test_for_token_classification

* Update src/transformers/models/vilt/modeling_vilt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add test_for_token_classification
Co-authored-by: gfuchs <gfuchs@ebay.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

2b096508

Owlvit docs test (#18257) · 002915aa

Alara Dirik authored Jul 26, 2022

* fix docs and add owlvit docs test

* fix minor bug in post_process, add to processor

* improve owlvit code examples

* fix hardcoded image size

002915aa

22 Jul, 2022 1 commit
- change bloom parameters to 176B (#18235) · 7cb4da13
  Muhammad Ahmed authored Jul 22, 2022
  
  7cb4da13