- 26 Apr, 2024 1 commit
-
-
Michael Goin authored
* Update modeling_utils/dtype_byte_size to handle float8 types
* Add a test for dtype_byte_size
* Format
* Fix bool
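As a rough illustration of the change (a sketch, not the exact transformers implementation): a per-dtype byte-size helper has to special-case `torch.bool` and parse the bit width out of float8 names such as `torch.float8_e4m3fn`, whose suffix would otherwise confuse a naive parse.

```python
import re
import torch

def dtype_byte_size(dtype: torch.dtype) -> float:
    """Bytes per element of `dtype` (bools counted as one bit)."""
    if dtype == torch.bool:
        return 1 / 8
    # float8 dtypes are named e.g. "torch.float8_e4m3fn"; match the first digit
    # group so the "_e4m3fn" suffix does not break the parse.
    match = re.search(r"[^\d](\d+)(_.*)?$", str(dtype))
    if match is None:
        raise ValueError(f"`dtype` is not a valid dtype: {dtype}.")
    return int(match.groups()[0]) // 8

# dtype_byte_size(torch.float8_e4m3fn) -> 1, dtype_byte_size(torch.float16) -> 2
```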
-
- 25 Apr, 2024 1 commit
-
-
Younes Belkada authored
ensure popular quant methods are supported
-
- 23 Apr, 2024 1 commit
-
-
Wing Lian authored
* fix for itemsize => element_size() for torch backwards compat
* improve handling of element counting
* Update src/transformers/modeling_utils.py
* fixup
* Update src/transformers/modeling_utils.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
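For context, a minimal sketch of the compatibility shim the first bullet describes (the helper name is illustrative, not the exact code in modeling_utils.py): `torch.dtype.itemsize` only exists on recent PyTorch releases, so older versions need a fallback that asks a tensor for its element size.

```python
import torch

def bytes_per_element(dtype: torch.dtype) -> int:
    """Bytes occupied by one element of `dtype`, on old and new torch alike."""
    if hasattr(dtype, "itemsize"):  # available on recent PyTorch versions
        return dtype.itemsize
    # Fallback for older releases: a zero-dim tensor knows its element size.
    return torch.empty((), dtype=dtype).element_size()
```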
-
- 19 Apr, 2024 2 commits
-
-
hoshi-hiyouga authored
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
-
Marc Sun authored
* Use unwrap with the one in accelerate
* oups
* update unwrap
* fix
* wording
* raise error instead
* comment
* doc
* Update src/transformers/modeling_utils.py
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* style
* put else
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
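The first bullet refers to delegating model unwrapping to accelerate. A hedged sketch of that idea, assuming accelerate is installed (this is not claimed to be the exact transformers code):

```python
import torch.nn as nn
from accelerate.utils import extract_model_from_parallel

def unwrap_model(model: nn.Module) -> nn.Module:
    """Strip DDP/compile-style wrappers by reusing accelerate's helper."""
    return extract_model_from_parallel(model)
```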
-
- 12 Apr, 2024 1 commit
-
-
Sai-Suraj-27 authored
* Fixed deprecated logger.warn by using logger.warning
* Reformatted using ruff.
-
- 10 Apr, 2024 1 commit
-
-
Younes Belkada authored
* fix torch compatibility issues
* fix
* Update src/transformers/modeling_utils.py
-
- 09 Apr, 2024 1 commit
-
-
Sourab Mangrulkar authored
* fix sequence length errors
* fix label column name error for vit
* fix the lm_head embedding != linear layer mismatches for Seq2Seq models
-
- 02 Apr, 2024 1 commit
-
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors, getting rid of the issue
  - Find all identical names (those should be declared in the config, but we try to find them all anyway)
  - For all identical names:
    - If they are in the config, just ignore them; everything is fine
    - If they are not, warn about them
  - For all remaining tensors which are shared yet neither identical NOR disjoint, raise a hard error
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add small tests.
* Dead variable.
* Fixup.
* Fixing tied_weights_keys on generic models.
* Fixup + T5 encoder/decoder tying (with different layers)
* Code quality.
* Dynamic member.
* trigger
* Fixing encoder name for other types of encoder/decoder combos.
* Fix scoping.
* Update .github/workflows/self-scheduled.yml
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fixing the tied_weights after the call.
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
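To illustrate the detection step described above, here is a rough sketch under the assumption that shared tensors can be found by comparing storage pointers (the function name is illustrative; this is not the literal code from the PR):

```python
from collections import defaultdict
import torch

def find_shared_tensor_names(state_dict: dict) -> list:
    """Group parameter names that view the same underlying storage."""
    groups = defaultdict(set)
    for name, tensor in state_dict.items():
        if tensor.device.type != "meta" and tensor.numel() > 0:
            groups[tensor.untyped_storage().data_ptr()].add(name)
    return [names for names in groups.values() if len(names) > 1]

# Names grouped here must either be declared tied weights (and dropped before
# saving), cloned if their views are disjoint, or trigger a hard error.
```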
-
- 27 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* Automatic safetensors conversion when lacking these files (#29390)
* Automatic safetensors conversion when lacking these files
* Remove debug
* Thread name
* Typo
* Ensure that raises do not affect the main thread
* Catch all errors
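A small sketch of the pattern the last two bullets describe: run the opportunistic conversion in a separate thread and swallow any exception so it cannot reach the main thread. The thread name, logging, and function signature below are illustrative assumptions, not the exact transformers code.

```python
import threading

def spawn_conversion(convert_fn, *args, **kwargs) -> threading.Thread:
    """Run a best-effort conversion in the background; never let it raise."""
    def _wrapped():
        try:
            convert_fn(*args, **kwargs)
        except Exception as exc:  # catch all errors: conversion is optional
            print(f"safetensors auto-conversion skipped: {exc}")

    thread = threading.Thread(target=_wrapped, name="Thread-autoconversion", daemon=True)
    thread.start()
    return thread
```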
-
- 25 Mar, 2024 2 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
- 18 Mar, 2024 1 commit
-
-
Younes Belkada authored
* make `unexpected_keys` optional
* push
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 15 Mar, 2024 1 commit
-
-
Marc Sun authored
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* find and fix edge case
* Update docs/source/en/quantization.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change by cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* typo
* typo
* remove comm
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
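For readers wondering what this integration looks like from the user side, here is an illustrative load with quanto weight quantization (the model id is a placeholder and the exact defaults may differ):

```python
from transformers import AutoModelForCausalLM, QuantoConfig

quant_config = QuantoConfig(weights="int8")  # int4, int2 and float8 are other options
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=quant_config,
    device_map="cuda:0",
)
```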
-
- 13 Mar, 2024 2 commits
-
-
Sourab Mangrulkar authored
* fsdp+qlora related changes
* fixes
* Update quantization_config.py
* support fsdp+qlora and dsz3+qlora
* Update quantization_config.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* handle fsdp+qlora and dsz3+qlora correctly while model loading
* fix param count
* quality
* fsdp related changes
* fsdp changes only when using LoRA/QLoRA
* add accelerate version check
* refactor, update min accelerate version and add tests
  1. Update minimum accelerate version to 0.26.0
  2. Clean the trainer wrt accelerate version checks
  3. FSDP refactor and test for fsdp config
  4. use `itemsize` instead of `dtype2bytes` dict
* fix test
* Address comments
  Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* fix the conditional flag
* fix conditional flag
* address comments
  Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
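As a hedged illustration of the kind of setup these changes enable (the model id and dtypes below are placeholders, not taken from the PR): for FSDP or DeepSpeed-Zero3 with QLoRA, the packed 4-bit weights are kept in a regular floating-point storage dtype so the sharding machinery can handle them.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,  # storage dtype FSDP can shard
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
```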
-
Jiewen Tan authored
* tmp
* Remove debug step
* Fix a typo
* Move to is_torch_xla_available
-
- 11 Mar, 2024 2 commits
-
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
-
Yitong Huang authored
* add USE_TORCH_XLA env
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling
---------
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
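A minimal sketch of what an env-gated availability check like this can look like (simplified; the real `is_torch_xla_available` in transformers does more than shown here, and the accepted env values are assumptions):

```python
import importlib.util
import os

def is_torch_xla_available() -> bool:
    """Report torch_xla as usable only if USE_TORCH_XLA permits it and the package exists."""
    if os.environ.get("USE_TORCH_XLA", "1").upper() in ("0", "FALSE", "NO", "OFF"):
        return False
    return importlib.util.find_spec("torch_xla") is not None
```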
-
- 07 Mar, 2024 2 commits
-
-
Alex Ishida authored
Add support for loading safetensors files saved with the `mlx` metadata format.
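To make the idea concrete, a hedged sketch of inspecting a checkpoint's metadata (the file name is a placeholder): safetensors files written by MLX carry `format: mlx` in their header, where PyTorch-saved files carry `format: pt`.

```python
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    metadata = f.metadata()  # e.g. {"format": "mlx"} or {"format": "pt"}

if metadata is None or metadata.get("format") not in ("pt", "tf", "flax", "mlx"):
    raise OSError("Checkpoint was not saved by a recognized framework.")
```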
-
Lysandre Debut authored
Revert "Automatic safetensors conversion when lacking these files (#29390)" This reverts commit a69cbf4e.
-
- 06 Mar, 2024 1 commit
-
-
Fanli Lin authored
* use require_torch_gpu
* enable on XPU
* fix
-
- 05 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* Automatic safetensors conversion when lacking these files
* Remove debug
* Thread name
* Typo
* Ensure that raises do not affect the main thread
-
- 01 Mar, 2024 1 commit
-
-
Song Fuchang authored
Expose `offload_buffers` parameter of `accelerate` to `PreTrainedModel.from_pretrained` method (#28755)
Expose offload_buffers parameter to from_pretrained method
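An illustrative call showing the exposed parameter (the model id and offload folder are placeholders): when weights are dispatched with a device_map and some end up on CPU or disk, `offload_buffers` asks accelerate to offload non-parameter buffers alongside them.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # placeholder model id
    device_map="auto",
    offload_folder="offload",
    offload_buffers=True,         # forwarded to accelerate's dispatch/offload logic
)
```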
-
- 27 Feb, 2024 1 commit
-
-
fxmarty authored
fix
-
- 20 Feb, 2024 1 commit
-
-
Arthur authored
* default to use it
* style
-
- 16 Feb, 2024 1 commit
-
-
Lysandre Debut authored
* Script & Manual edition
* Update
-
- 15 Feb, 2024 1 commit
-
-
Younes Belkada authored
Update modeling_utils.py
-
- 14 Feb, 2024 1 commit
-
-
Younes Belkada authored
* enhance trainer + not support quant methods
* remove all old logic
* add version
-
- 12 Feb, 2024 1 commit
-
-
JB (Don) authored
Continue to initialize tied output_embeddings if it has a bias term. The bias term is not tied, and so will need to be initialized accordingly.
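A toy sketch of the situation being described (dimensions and module names are illustrative): weight tying shares only the weight matrix, so a bias on the output projection still needs its own initialization.

```python
import torch.nn as nn

vocab_size, hidden_size = 100, 16
input_embeddings = nn.Embedding(vocab_size, hidden_size)
output_embeddings = nn.Linear(hidden_size, vocab_size, bias=True)

output_embeddings.weight = input_embeddings.weight  # tied: both modules share one tensor
nn.init.zeros_(output_embeddings.bias)              # bias is not tied and must be initialized
```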
-
- 06 Feb, 2024 1 commit
-
- 05 Feb, 2024 1 commit
-
-
Nicolas Patry authored
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors, getting rid of the issue
  - Find all identical names (those should be declared in the config, but we try to find them all anyway)
  - For all identical names:
    - If they are in the config, just ignore them; everything is fine
    - If they are not, warn about them
  - For all remaining tensors which are shared yet neither identical NOR disjoint, raise a hard error
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 02 Feb, 2024 2 commits
-
-
Juri Ganitkevitch authored
* Add missing None check for hf_quantizer
* Add test, fix logic.
* make style
* Switch test model to Mistral
* Comment
* Update tests/test_modeling_utils.py
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Klaus Hipp authored
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
-
- 31 Jan, 2024 1 commit
-
-
tom-p-reichel authored
* test that tied output embeddings aren't initialized on load
* don't initialize the output embeddings if we're going to tie them to the input embeddings
-
- 30 Jan, 2024 1 commit
-
-
Poedator authored
* squashed earlier commits for easier rebase
* rm rebase leftovers
* 4bit save enabled @quantizers
* TMP gptq test use exllama
* fix AwqConfigTest::test_wrong_backend for A100
* quantizers AWQ fixes
* _load_pretrained_model low_cpu_mem_usage branch
* quantizers style
* remove require_low_cpu_mem_usage attr
* rm dtype arg from process_model_before_weight_loading
* rm config_origin from Q-config
* rm inspect from q_config
* fixed docstrings in QuantizationConfigParser
* logger.warning fix
* mv is_loaded_in_4(8)bit to BnbHFQuantizer
* is_accelerate_available error msg fix in quantizer
* split is_model_trainable in bnb quantizer class
* rm llm_int8_skip_modules as separate var in Q
* Q rm todo
* fwd ref to HFQuantizer in type hint
* rm note re optimum.gptq.GPTQQuantizer
* quantization_config in __init__ simplified
* replaced NonImplemented with create_quantized_param
* rm load_in_4/8_bit deprecation warning
* QuantizationConfigParser refactoring
* awq-related minor changes
* awq-related changes
* awq config.modules_to_not_convert
* raise error if no q-method in q-config in args
* minor cleanup
* awq quantizer docstring
* combine common parts in bnb process_model_before_weight_loading
* revert test_gptq
* .process_model_ cleanup
* restore dict config warning
* removed typevars in quantizers.py
* cleanup post-rebase 16 jan
* QuantizationConfigParser classmethod refactor
* rework of handling of unexpected aux elements of bnb weights
* moved q-related stuff from save_pretrained to quantizers
* refactor v1
* more changes
* fix some tests
* remove it from main init
* ooops
* Apply suggestions from code review
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix awq issues
* fix
* fix
* fix
* fix
* fix
* fix
* add docs
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/hf_quantizer.md
* address comments
* fix
* fixup
* Update src/transformers/modeling_utils.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address final comment
* update
* Update src/transformers/quantizers/base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* add kwargs update
* fixup
* add `optimum_quantizer` attribute
* oops
* rm unneeded file
* fix doctests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
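Since the list above mentions that 4-bit saving is enabled through the new quantizers, here is a hedged end-to-end illustration (the model id and output path are placeholders, not taken from the PR, and saving 4-bit weights also depends on a recent bitsandbytes release):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m", quantization_config=config)
# With the quantizer refactor, serializing the quantized weights goes through the quantizer:
model.save_pretrained("opt-350m-4bit")
```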
-
- 26 Jan, 2024 4 commits
-
-
Scruel Tao authored
* fix: suppress `GatedRepoError` to use cache file (fix #28558).
* move condition_to_return parameter back to outside.
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Turetskii Mikhail authored
-
fxmarty authored
* fix duplicate & unnecessary flash warnings
* trigger ci
* warning_once
* if/else order
---------
Co-authored-by: Your Name <you@example.com>
-
- 18 Jan, 2024 1 commit
-
-
Yih-Dar authored
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-