Commits · f497f564bb76697edab09184a252fc1b1a326d1e · chenpangpang / transformers

16 Feb, 2024 1 commit
- Update all references to canonical models (#29001) · f497f564
  Lysandre Debut authored Feb 16, 2024
```
* Script & Manual edition

* Update
```
  f497f564
14 Feb, 2024 5 commits

Mask Generation Task Guide (#28897) · 3f4e79d2

Merve Noyan authored Feb 14, 2024

* Create mask_generation.md

* add h1

* add to toctree

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update mask_generation.md

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tas...

3f4e79d2

[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin`... · 7252e8d9

amyeroberts authored Feb 14, 2024

[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002)

* Trigger doc build

* Test removing references

* Importable from utils

* Trigger another run on a new commit for testing

7252e8d9

AQLM quantizer support (#28928) · 1ecf5f7c

Andrei Panferov authored Feb 14, 2024

* aqlm init

* calibration and dtypes

* docs

* Readme update

* is_aqlm_available

* Simpler link in docs

* Test TODO real reference

* init _import_structure fix

* AqlmConfig autodoc

* integration aqlm

* integrations in tests

* docstring fix

* legacy typing

* Less typings

* More kernels information

* Performance -> Accuracy

* correct tests

* removed multi-gpu test

* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Brought back multi-gpu tests

* Update src/transformers/integrations/aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/aqlm_integration/test_aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-autho...

1ecf5f7c

Add SiglipForImageClassification and CLIPForImageClassification (#28952) · 63ffd56d
NielsRogge authored Feb 14, 2024
```
* First draft

* Add CLIPForImageClassification

* Remove scripts

* Fix doctests
```
63ffd56d

Add `StableLM` (#28810) · de6029a0

Jonathan Tow authored Feb 14, 2024

* Add `StableLM`

* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`

* fix: re-add changes to address comments

* fix(readme): add links to paper

* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref

* fix(tests): re-add `@slow` decorator to integration tests

* fix(tests): import slow...

* fix(readme_hd): remove whitespace edit

* fix(tokenizer): auto tokenizer tuple

* skip doctests for `modeling_stablelm`

de6029a0

12 Feb, 2024 4 commits
- [i18n-de] Translate CONTRIBUTING.md to German (#28954) · d90acc16
  Klaus Hipp authored Feb 12, 2024
```
* Translate contributing.md to German

* Fix formatting issues in contributing.md

* Address review comments

* Fix capitalization
```
  d90acc16
- [Docs] Add video section (#28958) · 78ba9f46
  NielsRogge authored Feb 12, 2024
```
Add video section
```
  78ba9f46
- [Docs] Add language identifiers to fenced code blocks (#28955) · fe3df9d5
  Klaus Hipp authored Feb 12, 2024
```
Add language identifiers to code blocks
```
  fe3df9d5
- [Docs] Update README and default pipelines (#28864) · ef5ab72f
  NielsRogge authored Feb 12, 2024
```
* Update README and docs

* Update README

* Update README
```
  ef5ab72f
08 Feb, 2024 4 commits

[Docs] Fix broken links and syntax issues (#28918) · 2749e479

Klaus Hipp authored Feb 08, 2024

* Fix model documentation links in attention.md

* Fix external link syntax

* Fix target anchor names of section links

* Fix copyright statement comments

* Fix documentation headings

2749e479

[`Core generation`] Adds support for static KV cache (#27931) · 115ac94d

Arthur authored Feb 08, 2024

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

115ac94d

[Docs] Revert translation of '@slow' decorator (#28912) · 33df0369
Klaus Hipp authored Feb 08, 2024

33df0369
[Docs] Fix placement of tilde character (#28913) · 328ade85
Klaus Hipp authored Feb 08, 2024
```
Fix placement of tilde character
```
328ade85

06 Feb, 2024 3 commits

[Docs] Add missing language options and fix broken links (#28852) · 1c31b7aa

Klaus Hipp authored Feb 06, 2024

* Add missing entries to the language selector

* Add links to the Colab and AWS Studio notebooks for ONNX

* Use anchor links in CONTRIBUTING.md

* Fix broken hyperlinks due to spaces

* Fix links to OpenAI research articles

* Remove confusing footnote symbols from author names, as they are also considered invalid markup

1c31b7aa

[Docs] Fix backticks in inline code and documentation links (#28875) · 4830f269
Klaus Hipp authored Feb 06, 2024
```
Fix backticks in code blocks and documentation links
```
4830f269

Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777) · 2e7c942c

nakranivaibhav authored Feb 06, 2024

* This is a test commit

* testing commit

* final commit with some changes

* Removed copy statement

* Fixed formatting issues

* Fixed error: added past_key_values in the forward method

* Fixed a trailing whitespace. Damn the formatting rules are strict

* Added the copy statement

2e7c942c

05 Feb, 2024 1 commit

Image Feature Extraction pipeline (#28216) · ba3264b4

amyeroberts authored Feb 05, 2024

* Draft pipeline

* Fixup

* Fix docstrings

* Update doctest

* Update pipeline_model_mapping

* Update docstring

* Update tests

* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Fix docstrings - review comments

* Remove pipeline mapping for composite vision models

* Add to pipeline tests

* Remove for flava (multimodal)

* safe pil import

* Add requirements for pipeline run

* Account for super slow efficientnet

* Review comments

* Fix tests

* Swap order of kwargs

* Use build_pipeline_init_args

* Add back FE pipeline for Vilt

* Include image_processor_kwargs in docstring

* Mark test as flaky

* Update TODO

* Update tests/pipelines/test_pipelines_image_feature_extraction.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add license header

---------
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

ba3264b4

02 Feb, 2024 2 commits

[Docs] Fix spelling and grammar mistakes (#28825) · 721ee783

Klaus Hipp authored Feb 02, 2024

* Fix typos and grammar mistakes in docs and examples

* Fix typos in docstrings and comments

* Fix spelling of `tokenizer` in model tests

* Remove erroneous spaces in decorators

* Remove extra spaces in Markdown link texts

721ee783

[docs] HfQuantizer (#28820) · 2418c64a
Steven Liu authored Feb 01, 2024
```
* tidy

* fix path
```
2418c64a

01 Feb, 2024 4 commits

[docs] Backbone (#28739) · abbffc45
Steven Liu authored Feb 01, 2024
```
* backbones

* fix path

* fix paths

* fix code snippet

* fix links
```
abbffc45

Add models from deit (#28302) · 23ea6743

Rockerz authored Feb 01, 2024

* Add models

* Add 2 more models

* add models to toctree

* Add models

* Update docs/source/ja/model_doc/detr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/deit.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/deplot.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix bugs

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

23ea6743

Add tip on setting tokenizer attributes (#28764) · 7bc6d763

Matt authored Feb 01, 2024

* Add tip on setting tokenizer attributes

* Grammar

* Remove the bit that was causing doc builds to fail

7bc6d763

Adding [T5/MT5/UMT5]ForTokenClassification (#28443) · 0d26abdd

JB (Don) authored Feb 01, 2024

* Adding [T5/MT5/UMT5]ForTokenClassification

* Add auto mappings for T5ForTokenClassification and variants

* Adding ForTokenClassification to the list of models

* Adding attention_mask param to the T5ForTokenClassification test

* Remove outdated comment in test

* Adding EncoderOnly and Token Classification tests for MT5 and UMT5

* Fix typo in umt5 string

* Add tests for all the existing MT5 models

* Fix wrong comment in dependency_versions_table

* Reverting change to common test for _keys_to_ignore_on_load_missing

The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.

* Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model

* Add fix-copies to MT5ModelTest

0d26abdd

31 Jan, 2024 1 commit

Flax mistral (#26943) · f7076cd3

Kian Sierra McGettigan authored Jan 31, 2024

* direct copy from llama work

* mistral modules forward pass working

* flax mistral forward pass with sliding window

* added tests

* added layer collection approach

* Revert "added layer collection approach"

This reverts commit 0e2905bf2236ec323163fc1a9f0c016b21aa8b8f.

* Revert "Revert "added layer collection approach""

This reverts commit fb17b6187ac5d16da7c461e1130514dc3d137a43.

* fixed attention outputs

* added mistral to init and auto

* fixed import name

* fixed layernorm weight dtype

* freeze initialized weights

* make sure conversion considers bfloat16

* added backend

* added docstrings

* added cache

* fixed sliding window causal mask

* passes cache tests

* passed all tests

* applied make style

* removed commented out code

* applied fix-copies ignored other model changes

* applied make fix-copies

* removed unused functions

* passed generation integration test

* slow tests pass

* fixed slow tests

* changed default dtype from jax.numpy.float32 to float32 for docstring check

* skip cache test for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids

* updated checkpoint since from_pt not included

* applied black style

* removed unused args

* Applied styling and fixup

* changed checkpoint for doc back

* fixed rf after adding it to hf hub

* Add dummy ckpt

* applied styling

* added tokenizer to new ckpt

* fixed slice format

* fix init and slice

* changed ref for placeholder TODO

* added copies from Llama

* applied styling

* applied fix-copies

* fixed docs

* update weight dtype reconversion for sharded weights

* removed Nullable input ids

* Removed unnecessary output attentions in Module

* added embedding weight initialization

* removed unused past_key_values

* fixed deterministic

* Fixed RMS Norm and added copied from

* removed input_embeds

* applied make style

* removed nullable input ids from sequence classification model

* added copied from GPTJ

* added copied from Llama on FlaxMistralDecoderLayer

* added copied from to FlaxMistralPreTrainedModel methods

* fix test deprecation warning

* freeze gpt neox random_params and fix copies

* applied make style

* fixed doc issue

* skipped docstring test to align # copied from

* applied make style

* removed FlaxMistralForSequenceClassification

* removed unused padding_idx

* removed more sequence classification

* removed sequence classification

* applied styling and consistency

* added copied from in tests

* removed sequence classification test logic

* applied styling

* applied make style

* removed freeze and fixed copies

* undo test change

* changed repeat_kv to tile

* fixed to key value groups

* updated copyright year

* split causal_mask

* empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest

* went back to 2023 for tests_pr_documentation_tests

* went back to 2024

* changed tile to repeat

* applied make style

* empty for retry on Wav2Vec2

f7076cd3

30 Jan, 2024 3 commits

Add tf_keras imports to prepare for Keras 3 (#28588) · 415e9a09

Matt authored Jan 30, 2024

* Port core files + ESM (because ESM code is odd)

* Search-replace in modelling code

* Fix up transfo_xl as well

* Fix other core files + tests (still need to add correct import to tests)

* Fix cookiecutter

* make fixup, fix imports in some more core files

* Auto-add imports to tests

* Cleanup, add imports to sagemaker tests

* Use correct exception for importing tf_keras

* Fixes in modeling_tf_utils

* make fixup

* Correct version parsing code

* Ensure the pipeline tests correctly revert to float32 after each test

* Ensure the pipeline tests correctly revert to float32 after each test

* More tf.keras -> keras

* Add dtype cast

* Better imports of tf_keras

* Add a cast for tf.assign, just in case

* Fix callback imports

415e9a09

[`HfQuantizer`] Move it to "Developer guides" (#28768) · 866253f8
Younes Belkada authored Jan 30, 2024
```
Update _toctree.yml
```
866253f8

`HfQuantizer` class for quantization-related stuff in `modeling_utils.py` (#26610) · d78e78a0

Poedator authored Jan 30, 2024

* squashed earlier commits for easier rebase

* rm rebase leftovers

* 4bit save enabled @quantizers

* TMP gptq test use exllama

* fix AwqConfigTest::test_wrong_backend for A100

* quantizers AWQ fixes

* _load_pretrained_model low_cpu_mem_usage branch

* quantizers style

* remove require_low_cpu_mem_usage attr

* rm dtype arg from process_model_before_weight_loading

* rm config_origin from Q-config

* rm inspect from q_config

* fixed docstrings in QuantizationConfigParser

* logger.warning fix

* mv is_loaded_in_4(8)bit to BnbHFQuantizer

* is_accelerate_available error msg fix in quantizer

* split is_model_trainable in bnb quantizer class

* rm llm_int8_skip_modules as separate var in Q

* Q rm todo

* fwd ref to HFQuantizer in type hint

* rm note re optimum.gptq.GPTQQuantizer

* quantization_config in __init__ simplified

* replaced NonImplemented with create_quantized_param

* rm load_in_4/8_bit deprecation warning

* QuantizationConfigParser refactoring

* awq-related minor changes

* awq-related changes

* awq config.modules_to_not_convert

* raise error if no q-method in q-config in args

* minor cleanup

* awq quantizer docstring

* combine common parts in bnb process_model_before_weight_loading

* revert test_gptq

* .process_model_ cleanup

* restore dict config warning

* removed typevars in quantizers.py

* cleanup post-rebase 16 jan

* QuantizationConfigParser classmethod refactor

* rework of handling of unexpected aux elements of bnb weights

* moved q-related stuff from save_pretrained to quantizers

* refactor v1

* more changes

* fix some tests

* remove it from main init

* ooops

* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* fix awq issues

* fix

* fix

* fix

* fix

* fix

* fix

* add docs

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/hf_quantizer.md

* address comments

* fix

* fixup

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address final comment

* update

* Update src/transformers/quantizers/base.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/quantizers/auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* add kwargs update

* fixup

* add `optimum_quantizer` attribute

* oops

* rm unneeded file

* fix doctests

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

d78e78a0

29 Jan, 2024 3 commits
- [Whisper] Make tokenizer normalization public (#28136) · da3c79b2
  Sanchit Gandhi authored Jan 29, 2024
```
* [Whisper] Make tokenizer normalization public

* add to docs
```
  da3c79b2
- small doc update for CamemBERT (#28644) · 26aa03a2
  Julien Chaumond authored Jan 29, 2024
  
  26aa03a2
- [Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD -> TMDB) (#28751) · 3a08cc48
  Vinyzu authored Jan 29, 2024
```
* [Docs] Fix Typo in English CLIP model_doc

* [Docs] Fix Typo in Japanese CLIP model_doc
```
  3a08cc48
26 Jan, 2024 2 commits
- [docs] Fix datasets in guides (#28715) · abe0289e
  Steven Liu authored Jan 26, 2024
```
* change datasets

* fix
```
  abe0289e
- [`docs`] Update preprocessing.md (#28719) · 3a46e30d
  D authored Jan 26, 2024
```
* Update preprocessing.md

adjust ImageProcessor link to working target (same as in lower section of file)

* Update preprocessing.md
```
  3a46e30d
25 Jan, 2024 4 commits

[`docs`] Improve visualization for vertical parallelism (#28583) · 28751958

Peter Götz authored Jan 25, 2024

The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change makes the visualization vertical so that it matches the text.

28751958

Update question_answering.md (#28694) · 24f1a00e

Yusuf authored Jan 25, 2024

Fix typo:

from:
model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")

to:
model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")

24f1a00e

Improve Backbone API docs (#28666) · 20000956
Merve Noyan authored Jan 25, 2024
```
Update backbones.md
```
20000956

Add Depth Anything (#28654) · 963db81a

NielsRogge authored Jan 25, 2024

* First draft

* More improvements

* More improvements

* More improvements

* More improvements

* Add docs

* Remove file

* Add copied from

* Address comments

* Address comments

* Address comments

* Fix style

* Update docs

* Convert all checkpoints, add integration test

* Rename checkpoints

* Add pretrained backbone attributes

* Fix default config

* Address comment

* Add figure to docs

* Fix bug thanks to @xenova

* Update conversion script

* Fix integration test

963db81a

24 Jan, 2024 3 commits

[docs] Fix doc format (#28684) · f40b87de
Steven Liu authored Jan 24, 2024
```
* fix hfoptions

* revert changes to other files

* fix
```
f40b87de

improve efficient training on CPU documentation (#28646) · 8278b153

Fanli Lin authored Jan 25, 2024

* update doc

* revert

* typo fix

* refine

* add dtypes

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* no comma

* use avx512-vnni

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

8278b153

[docs] DeepSpeed (#28542) · 738ec75c

Steven Liu authored Jan 24, 2024

* config

* optim

* pre deploy

* deploy

* save weights, memory, troubleshoot, non-Trainer

* done

738ec75c