Commits · eb3ded16f730019b223b06514a8d03df4199feb1 · chenpangpang / transformers

09 Aug, 2023 2 commits
- Generate: generation config validation fixes in docs (#25405) · f456b4d1
  Joao Gante authored Aug 09, 2023
  
  f456b4d1
- Docs: introduction to generation with LLMs (#25240) · d59b872c
  Joao Gante authored Aug 09, 2023
```
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
```
  d59b872c
07 Aug, 2023 1 commit

Docs: Added benchmarks for `torch.compile()` for vision models (#24748) · 5ee9693a

Merve Noyan authored Aug 07, 2023



* added benchmarks for compile

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* added more models

* added more models fr

* added visualizations

* minor fix

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Added links to models and put charts side by side

* Added batch comparisons

* Added more comparisons

* Fix table

* Added link to wheel

* Update perf_torch_compile.md

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

5ee9693a

04 Aug, 2023 1 commit

Document check copies (#25291) · f0fd73a2

Sylvain Gugger authored Aug 04, 2023

* Document check copies better and add tests

* Include header in check for copies

* Manual fixes

* Try autofix

* Fixes

* Clean tests

* Finalize doc

* Remove debug print

* More fixes

f0fd73a2

03 Aug, 2023 5 commits

Fix typo: Roberta -> RoBERTa (#25302) · 641adca5
Victor Geislinger authored Aug 03, 2023

641adca5
[small] llama2.md typo (#25295) · 33da2db5
Howard Huang authored Aug 03, 2023
```
`groupe` -> `grouped`
```
33da2db5

add generate method to SpeechT5ForTextToSpeech (#25233) · 6d3f9c1e

Yoach Lacombe authored Aug 03, 2023



* add generate method to SpeechT5ForTextToSpeech

* update speecht5forTTS docstrings

* Remove defaults to None in generate docstrings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

6d3f9c1e

Update bark doc (#25234) · 8455346c

Yoach Lacombe authored Aug 03, 2023



* add mention to optimization in Bark docs

* add offload mention in docs

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update bark docs.

* Update bark.md

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

8455346c

Docs: separate generate section (#25235) · a8817371
Joao Gante authored Aug 03, 2023
```
Separate generate doc section
```
a8817371

02 Aug, 2023 1 commit
- recommend DeepSpeed's Argument Parsing documentation (#25268) · ad832151
  Kevin Lloyd Bernal authored Aug 02, 2023
  
  ad832151
01 Aug, 2023 1 commit

[`Docs`/`quantization`] Clearer explanation on how things works under the... · 972fdcc7

Younes Belkada authored Aug 01, 2023


[`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216)

* clearer explanation on how things works under the hood.

* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add `load_in_4bit` in `from_pretrained`

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

972fdcc7

31 Jul, 2023 1 commit
- [quantization.md] fix (#25190) · 52206066
  Stas Bekman authored Jul 31, 2023
```
Update quantization.md
```
  52206066
27 Jul, 2023 1 commit

Add bloom flax (#25094) · e9310363

Sanchit Gandhi authored Jul 27, 2023



* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding albii test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* per-weight check

* style

* line formats

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e9310363

25 Jul, 2023 7 commits

Fix doctest (#25031) · da5ff18a

Yih-Dar authored Jul 25, 2023



fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

da5ff18a

[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726) · 8f36ab3e

Sebastian Husch Lee authored Jul 25, 2023

* Initial addition of t5forsequenceclassification

* Adding imports and adding tests

* Formatting

* Running make fix-copies

* Adding mt5forseq

* Formatting

* run make fix-copies

* Adding to docs

* Add model_parallel

* Fix bug

* Fix

* Remove TODO

* Fixing tests for T5ForSequenceClassification

* Undo changes to dependency_versions_table.py

* Change classification head to work with T5Config directly

* Change seq length to let tests pass

* PR comments for formatting

* Formatting

* Initial addition of UMT5ForSequenceClassification

* Adding to inits and formatting

* run make fix-copies

* Add doc for UMT5ForSeqClass

* Update UMT5 config

* Fix docs

* Skip torch fx test for SequenceClassification

* Formatting

* Add skip to UMT5 tests as well

* Fix umt5 tests

* Running make fix-copies

* PR comments

* Fix for change to sentence_representation

* Rename seq_len to hidden_size since that's what it is

* Use base_model to follow format of the rest of the library

* Update docs

* Extract the decoder_input_ids changes and make one liner

* Make one-liner

8f36ab3e

[`MPT`] Add MosaicML's `MPT` model to transformers (#24629) · dcb183f4

Arthur authored Jul 25, 2023



* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke ande updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match à 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove unneeded position ids

* fix some  comments

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix nit

* add another slow test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

dcb183f4

Set `TF32` flag for PyTorch cuDNN backend (#25075) · 6bc61aa7
Xuehai Pan authored Jul 25, 2023

6bc61aa7
fix: add TOC anchor link (#25066) · 5dba88b2
Injin Paek authored Jul 25, 2023

5dba88b2

🌐

[i18n-KO] Translated `perf_hardware.md` to Korean (#24966) · ee1eb3b3

Sangam Lee authored Jul 25, 2023



* docs: ko: perf_hardware.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* fix: resolve suggestions
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>

* Fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: fix rendering error of perf_hardware.md

---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>

ee1eb3b3

[`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055) · c53a6eae
Arthur authored Jul 25, 2023
```
* Add note in doc on `RwkvStoppingCriteria`

* give some breathing space to the code
```
c53a6eae

24 Jul, 2023 2 commits

Pvt model (#24720) · a03d13c8

Rinat authored Jul 24, 2023

* pull and push updates

* add docs

* fix modeling

* Add and run test

* make copies

* add task

* fix tests and fix small issues

* Checks on a Pull Request

* fix docs

* add desc pvt.md

a03d13c8

[docs] Performance docs tidy up, part 1 (#23963) · 75317aef

Maria Khalusova authored Jul 24, 2023



* first pass at the single gpu doc

* overview: improved clarity and navigation

* WIP

* updated intro and deepspeed sections

* improved torch.compile section

* more improvements

* minor improvements

* make style

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed

* mdx -> md

* link fix

* feedback addressed

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

75317aef

21 Jul, 2023 3 commits

Remove tokenizers from the doc table (#24963) · 640e1b6c
Sylvain Gugger authored Jul 21, 2023

640e1b6c

fsdp fixes and enhancements (#24980) · f4eb459e

Sourab Mangrulkar authored Jul 21, 2023

* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py

f4eb459e

🌐

[i18n-KO] Fixed Korean and English `quicktour.md` (#24664) · ec3dfe5e

Wonhyeong Seo authored Jul 21, 2023



* fix: english/korean quicktour.md

* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>

* fix: follow glossary

* 파인튜닝 -> 미세조정

---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>

ec3dfe5e

20 Jul, 2023 1 commit

Deprecate unused OpenLlama architecture (#24922) · 79444f37

Tom Aarsen authored Jul 20, 2023

* Resolve typo in check_repo.py

* Specify encoding when opening modeling files

* Deprecate the OpenLlama architecture

* Add disclaimer pointing to Llama

I'm open to different wordings here

* Match the capitalisation of LLaMA

79444f37

19 Jul, 2023 2 commits

Fix minor llama2.md model doc typos (#24909) · 3a43794d
Travis Cline authored Jul 19, 2023
```
Update llama2.md

 Fix typos in the llama2 model doc
```
3a43794d

Update tested versions in READMEs (#24895) · c0359702

Eliah Kagan authored Jul 19, 2023

* Update supported Python and PyTorch versions in readme

* Update Python, etc. versions in non-English readmes

These were more out of date than in the English readme. This
updates all the versions the readmes claim the repository is tested
with to the same versions stated in the English readme.

Those versions are current at least in the case of the Python and
PyTorch versions (and less out of date for the others).

* Propagate trailing whitespace fix to model list

This runs "make fix-copies". The only change is the removal of
whitespace. No actual information or wording is changed.

* Update tested TensorFlow to 2.6 in all readmes

Per pinning in setup.py

Unlike Python and PyTorch, the minimum supported TensorFlow version
has not very recently changed, but old versions were listed in all
READMEs.

c0359702

18 Jul, 2023 3 commits

[`Llama2`] Add support for Llama 2 (#24891) · 07360b6c

Arthur authored Jul 18, 2023



* add llama

* add other readmes

* update padding id in readme

* add link to paper

* fix paths and tokenizer

* more nits

* styling

* fit operation in 2 lines when possible

* nits

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add form

* update reademe

* update readme, we don't have a default pad token

* update test and tokenization

* LLaMA instead of Llama

* nits

* add expected text

* add greeedy output

* styling

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* sequential device map

* skip relevant changes

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

07360b6c

Add DINOv2 (#24016) · 3ec10e6c

NielsRogge authored Jul 18, 2023

* First draft

* More improvements

* Convert patch embedding layer

* Convert all weights

* Make conversion work

* Improve conversion script

* Fix style

* Make all tests pass

* Add image processor to auto mapping

* Add swiglu ffn

* Add image processor to conversion script

* Fix conversion of giant model

* Fix documentation

* Fix style

* Fix tests

* Address comments

* Address more comments

* Remove unused arguments

* Remove more arguments

* Rename parameters

* Include mask token

* Address comments

* Add docstring

* Transfer checkpoints

* Empty commit

3ec10e6c

[`Docs`] Clarify 4bit docs (#24878) · ca974aff

Younes Belkada authored Jul 18, 2023



* clarify 4bit docs

* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

ca974aff

17 Jul, 2023 3 commits

Add bark (#24086) · f42a35e6

Yoach Lacombe authored Jul 17, 2023



* first raw version of the bark integration

* working code on small models with single run

* add converting script from suno weights 2 hf

* many changes

* correct past_kv output

* working implementation for inference

* update the converting script according to the architecture changes

* add a working end-to-end inference code

* remove some comments and make small changes

* remove unecessary comment

* add docstrings and ensure no unecessary intermediary output during audio generation

* remove done TODOs

* make style + add config docstrings

* modification for batch inference support on the whole model

* add details to .generation_audio method

* add copyright

* convert EncodecModel from original library to transformers implementation

* add two class in order to facilitate model and sub-models loading from the hub

* add support of loading the whole model

* add BarkProcessor

* correct modeling according to processor output

* Add proper __init__ and auto support

* Add up-to-date copyright/license message

* add relative import instead of absolute

* cleaner head_dim computation

* small comment removal or changes

* more verbose LayerNorm init method

* specify eps for clearer comprehension

* more verbose variable naming in the MLP module

* remove unecessary BarkBlock parameter

* clearer code in the forward pass of the BarkBlock

* remove _initialize_modules method for cleaner code

* Remove unnecessary methods from sub-models

* move code to remove unnecessary function

* rename a variable for clarity and change an assert

* move code and change variable name for clarity

* remove unnecessary asserts

* correct small bug

* correct a comment

* change variable names for clarity

* remove asserts

* change import from absolute to relative

* correct small error due to comma missing + correct import

* Add attribute Bark config

* add first version of tests

* update attention_map

* add tie_weights and resize_token_embeddings for fineModel

* correct getting attention_mask in generate_text_semantic

* remove Bark inference trick

* leave more choices in barkProcessor

* remove _no_split_modules

* fixe error in forward of block and introduce clearer notations

* correct converting script with last changes

* make style + add draft bark.mdx

* correct BarkModelTest::test_generate_text_semantic

* add Bark in main README

* add dummy_pt_objects for Bark

* add missing models in the main init

* correct test_decoder_model_past_with_large_inputs

* disable torchscript test

* change docstring of BarkProcessor

* Add test_processor_bark

* make style

* correct copyrights

* add bark.mdx + make style, quality and consistency

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Remove unnecessary test method

* simply logic of a test

* Only check first ids for slow audio generation

* split full end-to-end generation tests

* remove unneccessary comment

* change submodel names for clearer naming

* remove ModuleDict from modeling_bark

* combine two if statements

* ensure that an edge misued won't happen

* modify variable name

* move code snippet to the right place (coarse instead of semantic)

* change BarkSemanticModule -> BarkSemanticModel

* align BarkProcessor with transformers paradigm

* correct BarkProcessor tests with last commit changes

* change _validate_voice_preset to an instance method instead of a class method

* tie_weights already called with post_init

* add codec_model config to configuration

* update bark modeling tests with recent BarkProcessor changes

* remove SubModelPretrainedModel + change speakers embeddings prompt type in BarkModel

* change absolute imports to relative

* remove TODO

* change docstrings

* add examples to docs and docstrings

* make style

* uses BatchFeature in BarkProcessor insteads of dict

* continue improving docstrings and docs + make style

* correct docstrings examples

* more comprehensible speaker_embeddings load/Save

* rename speaker_embeddings_dict -> speaker_embeddings

* correct bark.mdx + add bark to documentation_tests

* correct docstrings configuration_bark

* integrate last nit suggestions

* integrate BarkGeneration configs

* make style

* remove bark tests from documentation_tests.txt because timeout - tested manually

* add proper generation config initialization

* small bark.mdx documentation changes

* rename bark.mdx -> bark.md

* add torch.no_grad behind BarkModel.generate_audio()

* replace assert by ValueError in convert_suno_to_hf.py

* integrate a series of short comments from reviewer

* move SemanticLogitsProcessors and remove .detach() from Bark docs and docstrings

* actually remove SemanticLogitsProcessor from modeling_bark.oy

* BarkProcessor returns a single output instead of tuple + correct docstrings

* make style + correct bug

* add initializer_range to BarkConfig + correct slow modeling tests

* add .clone() to history_prompt.coarse_prompt to avoid modifying input array

* Making sure no extra "`" are present

* remove extra characters in modeling_bark.py

* Correct output if history_prompt is None

* remove TODOs

* remove ravel comment

* completing generation_configuration_bark.py docstrings

* change docstrings - number of audio codebooks instead of Encodec codebooks

* change 'bias' docstrings in configuration_bark.py

* format code

* rename BarkModel.generate_audio -> BarkModel.generate_speech

* modify AutoConfig instead of EncodecConfig in BarkConfig

* correct AutoConfig wrong init

* refactor BarkModel and sub-models generate_coarse, generate_fine, generate_text_semantic

* remove SemanticLogitsProcessor and replace it with SuppressTokensLogitsProcessor

* move nb_codebook related config arguments to BarkFineConfig

* rename bark.mdx -> bark.md

* correcting BarkModelConfig from_pretrained + remove keys_to_ignore

* correct bark.md with correct hub path

* correct code bug in bark.md

* correct list tokens_to_suppress

* modify Processor to load nested speaker embeddings in a safer way

* correct batch sampling in BarkFineModel.generate_fine

* Apply suggestions from code review

Small docstrings correction and code improvements
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* give more details about num_layers in docstrings

* correct indentation mistake

* correct submodelconfig order of docstring variables

* put audio models in alphabetical order in utils/check_repo.my

* remove useless line from test_modeling_bark.py

* makes BarkCoarseModelTest inherits from (ModelTesterMixin, GenerationTesterMixin, unittest.TestCase) instead of BarkSemanticModelTest

* make a Tester class for each sub-model instead of inheriting

* add test_resize_embeddings=True for Bark sub-models

* add Copied from transformers.models.gpt_neo.modeling_gpt_neo.GPTNeoSelfAttention._split_heads

* remove 'Copied fom Bark' comment

* remove unneccessary comment

* change np.min -> min in modeling_bark.py

* refactored all custom layers to have Bark prefix

* add attention_mask as an argument of generate_text_semantic

* refactor sub-models start docstrings to have more precise config class definition

* move _tied_weights_keys overriding

* add docstrings to generate_xxx in modeling_bark.py

* add loading whole BarkModel to convert_suno_to_hf

* refactor attribute and variable names

* make style convert_suno

* update bark checkpoints

* remove never entered if statement

* move bark_modeling docstrings after BarkPretrainedModel class definition

* refactor modeling_bark.py: kv -> key_values

* small nits - code refactoring and removing unecessary lines from _init_weights

* nits - replace inplace method by variable assigning

* remove *optional* when necessary

* remove some lines in generate_speech

* add default value for optional parameter

* Refactor preprocess_histories_before_coarse -> preprocess_histories
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct usage after refactoring

* refactor Bark's generate_xxx -> generate and modify docstrings and tests accordingly

* update docstrings python in configuration_bark.py

* add bark files in utils/documentation_test.txt

* correct docstrings python snippet

* add the ability to use parameters in the form of e.g coarse_temperature

* add semantic_max_new_tokens in python snippet in docstrings for quicker generation

* Reformate sub-models kwargs in BakModel.generate
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* correct kwargs in BarkModel.generate

* correct attention_mask kwarg in BarkModel.generate

* add tests for sub-models args in BarkModel.generate and correct BarkFineModel.test_generate_fp16

* enrich BarkModel.generate docstrings with a description of how to use the kwargs

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

f42a35e6

Add Multimodal heading and Document question answering in task_summary.mdx (#23318) · 4f088870

Samin Yasar authored Jul 17, 2023

* add multimodal heading and docqa

* fix sentence

* task_summary data type = modality clarification

* change the multimodal example to a smaller model

4f088870

deprecate `sharded_ddp` training argument (#24825) · 8ba26c18

statelesshz authored Jul 17, 2023



* deprecate fairscale's ShardedDDP

* fix code style

* roll back

* deprecate the `sharded_ddp` training argument

---------
Co-authored-by: jihuazhong <jihuazhong1@huawei.com>

8ba26c18

13 Jul, 2023 3 commits

Remove Falcon docs for the release until TGI is ready (#24808) · c0ca73dc
Matt authored Jul 13, 2023
```
* Remove Falcon docs for the release until TGI is ready

* Update toctree
```
c0ca73dc
Fix typo 'submosules' (#24809) · f9a711df
dymil authored Jul 13, 2023

f9a711df

Deprecate models (#24787) · 9342c8fb

Sylvain Gugger authored Jul 13, 2023



* Deprecate some models

* Fix imports

* Fix inits too

* Remove tests

* Add deprecated banner to documentation

* Remove from init

* Fix auto classes

* Style

* Remote upgrade strategy 1

* Remove site package cache

* Revert this part

* Fix typo...

* Update utils

* Update docs/source/en/model_doc/bort.md
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Address review comments

* With all files saved

---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

9342c8fb

11 Jul, 2023 2 commits

Add ViViT (#22518) · 8a5e8a9c

Jegor Kitškerkin authored Jul 11, 2023



* Add model

* Add ability to get classification head weights

* Add docs

* Add imports to __init__.py

* Run style

* Fix imports and add mdx doc

* Run style

* Fix copyright

* Fix config docstring

* Remove imports of ViViTLayer and load_tf_weights_in_vivit

* Remove FeatureExtractor and replace with ImageProcessor everywhere

* Remove ViViTForPreTraining from vivit.mdx

* Change ViViT -> Vivit everywhere

* Add model_doc to _toctree.yml

* Replace tuples with lists in arguments of VivitConfig

* Rename patch_size to tubelet_size in TubeletEmbeddings

* Fix checkpoint names

* Add tests

* Remove unused num_frames

* Fix imports for VivitImageProcessor

* Minor fixes

* Decrease number of frames in VivitModelTester from 32 to 16

* Decrease number of frames in VivitModelTester from 16 to 8

* Add initialization for pos embeddings

* Rename Vivit -> ViViT in some places

* Fix docstring and formatting

* Rename TubeletEmbeddings -> VivitTubeletEmbeddings

* Remove load_tf_weights_in_vivit

* Change checkpoint name

* Remove Vivit _TOKENIZER_FOR_DOC

* Fix

* Fix VivitTubeletEmbeddings and pass config object as parameter

* Use image_size and num_frames instead of video_size

* Change conversion script and fix differences with the orig implementation

* Fix docstrings

* Add attention head pruning

* Run style and fixup

* Fix tests

* Add ViViT to video_classification.mdx

* Save processor in conversion script

* Fix

* Add image processor test

* Run fixup and style

* Run fix-copies

* Update tests/models/vivit/test_modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/vivit/test_modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use PyAV instead of decord

* Add unittest.skip

* Run style

* Remove unneeded test

* Update docs/source/en/model_doc/vivit.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/configuration_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/modeling_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add model

* Add docs

* Run style

* Fix imports and add mdx doc

* Remove FeatureExtractor and replace with ImageProcessor everywhere

* Change ViViT -> Vivit everywhere

* Rename Vivit -> ViViT in some places

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Run make style

* Remove inputs save

* Fix image processor

* Fix

* Run `make style`

* Decrease parameters of VivitModelTester

* Decrease tubelet size

* Rename vivit.mdx

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/vivit/image_processing_vivit.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix default values in image_processing_vivit.py

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

8a5e8a9c

Falcon port (#24523) · b3ab3fac

Matt authored Jul 11, 2023



* Initial commit

* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Cleanup config docstring

* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Convert to relative imports

* Remove torch < 1.8 warning

* Restructure cos_sin header

* qkv -> query, key, value

* Refactor attention calculation

* Add a couple of config variables to account for the different checkpoints

* Successful merging of the code paths!

* Fix misplaced line in the non-parallel attention path

* Update config and tests

* Add a pad_token_id when testing

* Support output_attentions when alibi is None

* make fixup

* Skip KV cache shape test

* No more _keys_to_ignore_on_load_missing

* Simplify self attention a bit

* Simplify self attention a bit

* make fixup

* stash commit

* Some more attention mask updates

* Should pass all tests except assisted generation!

* Add big model generation test

* make fixup

* Add temporary workaround for test

* Test overrides for assisted generation

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Test overrides for assisted generation

* Add generation demo

* Update copyright

* Make the docstring model actually small

* Add module-level docstring

* Remove all assertions

* Add copied from bloom

* Reformat the QKV layer

* Add copied from bloom

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove unused line and reformat

* No single letter variables

* Cleanup return names

* Add copied from line

* Remove the deprecated arguments blocks

* Change the embeddings test to an alibi on/off test

* Remove position_ids from FalconForQA

* Remove old check for token type IDs

* Fix the alibi path when multi_query is False

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update config naming

* Fix typo for new_decoder_architecture

* Add some comments

* Fix docstring

* Fix docstring

* Create range in the right dtype from the start

* Review comment cleanup

* n_head_kv -> num_kv_heads

* self.alibi -> self.use_alibi

* self.num_kv -> self.num_kv_heads

* Reorder config args

* Made alibi arguments Optional

* Add all model docstrings

* Add extra checkpoints

* Add author info for Falcon

* Stop removing token_type_ids because our checkpoints shouldn't return it anymore

* Add one hopeful comment for the future

* Fix typo

* Update tests, fix cache issue for generation

* Use -1e9 instead of -inf to avoid float overflow

* Recompute the rotary embeddings much less often

* Re-enable disabled tests

* One final fix to attention mask calculation, and update tests

* Cleanup targeting falcon-40b equivalency

* Post-rebase docs update

* Update docstrings, especially in the config

* More descriptive variable names, and comments where we can't rename them

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

b3ab3fac

10 Jul, 2023 1 commit
- add link to accelerate doc (#24601) · 35eac0df
  Marc Sun authored Jul 10, 2023
  
  35eac0df