1. 21 Dec, 2023 1 commit
  2. 16 Oct, 2023 1 commit
  3. 13 Sep, 2023 2 commits
  4. 10 Aug, 2023 1 commit
  5. 28 Jun, 2023 2 commits
  6. 10 Jun, 2023 1 commit
  7. 09 Jun, 2023 1 commit
  8. 01 Jun, 2023 1 commit
  9. 24 May, 2023 1 commit
    • 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) · 9d73b922
      Tim Dettmers authored
      
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All tests green for 8-bit and 4-bit layers.
      
      * Added fix for fp32 layer norms and bf16 compute in LLaMA.
      
      * Fixing issues for PR #23479.
      
      * Reverted variable name change.
      
      * Added missing tests.
      
      * Fixup changes.
      
      * Added fixup changes.
      
      * Missed some variables to rename.
      
      * revert trainer tests
      
      * revert test trainer
      
      * another revert
      
      * fix tests and safety checkers
      
      * protect import
      
      * simplify a bit
      
      * Update src/transformers/trainer.py
      
      * few fixes
      
      * add warning
      
      * replace with `load_in_kbit = load_in_4bit or load_in_8bit`
      
      * fix test
      
      * fix tests
      
      * this time fix tests
      
      * safety checker
      
      * add docs
      
      * revert torch_dtype
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * multiple fixes
      
      * update docs
      
      * version checks and multiple fixes
      
      * replace `is_loaded_in_kbit`
      
      * replace `load_in_kbit`
      
      * change methods names
      
      * better checks
      
      * oops
      
      * oops
      
      * address final comments
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      9d73b922
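
The commit above (#23479) wires bitsandbytes 4-bit quantization into `from_pretrained` (the QLoRA base-model path) and adds the paged/Lion optimizers. Below is a minimal usage sketch, not taken from the commit itself: the checkpoint name is a placeholder, and the `BitsAndBytesConfig` fields (`bnb_4bit_quant_type`, `bnb_4bit_use_double_quant`, `bnb_4bit_compute_dtype`) follow the current transformers documentation rather than this PR's diff.

```python
# Hedged sketch: load a 4-bit base model for QLoRA-style fine-tuning.
# The checkpoint name is a placeholder; config values are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute (layer norms stay in fp32, per the fix above)
)

model_id = "huggyllama/llama-7b"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # dispatch the quantized weights across available devices
)
# LoRA adapters (e.g. via peft) are then trained on top of the frozen 4-bit base model.
```

The paged optimizers mentioned in the same commit are reachable through the Trainer, e.g. `TrainingArguments(optim="paged_adamw_32bit")`; the exact optim string reflects the current docs and may differ from the names first introduced here.
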
  10. 22 May, 2023 1 commit
  11. 12 Apr, 2023 1 commit
  12. 29 Mar, 2023 1 commit
  13. 20 Feb, 2023 1 commit
  14. 17 Feb, 2023 1 commit
  15. 13 Feb, 2023 1 commit
  16. 02 Feb, 2023 1 commit
  17. 24 Jan, 2023 1 commit
  18. 13 Dec, 2022 1 commit
  19. 23 Nov, 2022 1 commit
  20. 17 Nov, 2022 2 commits
  21. 12 Aug, 2022 1 commit
  22. 10 Aug, 2022 1 commit
    • `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a
      Younes Belkada authored
      
      
      * first commit
      
      * correct replace function
      
      * add final changes
      
      - works like a charm!
      - cannot implement tests yet
      - tested
      
      * clean up a bit
      
      * add bitsandbytes dependencies
      
      * working version
      
      - added import function
      - added bitsandbytes utils file
      
      * small fix
      
      * small fix
      
      - fix import issue
      
      * fix import issues
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit
      
      - move bitsandbytes utils to utils
      - change comments on functions
      
      * reformat docstring
      
      - reformat docstring on init_empty_weights_8bit
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert bad formatting
      
      * change to bitsandbytes
      
      * refactor a bit
      
      - remove init8bit since it is useless
      
      * more refactoring
      
      - fixed init empty weights issue
      - added threshold param
      
      * small hack to make it work
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * remove the small hack
      
      * modify utils file
      
      * make style + refactor a bit
      
      * correctly create device map
      
      * add correct dtype for device map creation
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      - remove with torch.grad
      - do not rely on Python bool magic!
      
      * add docstring
      
       - add docstring for new kwargs
      
      * add docstring
      
      - comment `replace_8bit_linear` function
      - fix weird formatting
      
      * - added more documentation
      - added new utility function for memory footprint tracking
      - colab demo to add
      
      * few modifs
      
      - typo doc
      - force cast into float16 when load_in_8bit is enabled
      
      * added colab link
      
      * add test architecture + docstring a bit
      
      * refactor a bit testing class
      
      * make style + refactor a bit
      
      * enhance checks
      
      - add more checks
      - start writing saving test
      
      * clean up a bit
      
      * make style
      
      * add more details on doc
      
      * add more tests
      
      - still needs to fix 2 tests
      
      * replace by "or"
      
      - could not fix it from GitHub GUI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit testing code + add readme
      
      * make style
      
      * fix import issue
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      
      * add few comments
      
      * add more doctring + make style
      
      * more docstring
      
      * raise error when loaded in 8bit
      
      * make style
      
      * add warning if loaded on CPU
      
      * add small sanity check
      
      * fix small comment
      
      * add bitsandbytes on dockerfile
      
      * Improve documentation
      
      - improve documentation from comments
      
      * add few comments
      
      * slow tests pass on the VM but not on the CI VM
      
      * Fix merge conflict
      
      * make style
      
      * another test should pass on a multi gpu setup
      
      * fix bad import in testing file
      
      * Fix slow tests
      
      - remove dummy batches
      - no more CUDA illegal memory errors
      
      * modify dockerfile
      
      * Update docs/source/en/main_classes/model.mdx
      
      * Update Dockerfile
      
      * Update model.mdx
      
      * Update Dockerfile
      
      * Apply suggestions from code review
      
      * few modifications
      
      - lm head can stay on disk/cpu
      - change model name so that test pass
      
      * change test value
      
      - change test value to the correct output
      - torch bmm changed to baddmm in bloom modeling when merging
      
      * modify installation guidelines
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * replace `n` by `name`
      
      * merge `load_in_8bit` and `low_cpu_mem_usage`
      
      * first try - keep the lm head in full precision
      
      * better check
      
      - check the attribute `base_model_prefix` instead of computing the number of parameters
      
      * added more tests
      
      * Update src/transformers/utils/bitsandbytes.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
      
      * improve documentation
      
      - fix typos for installation
      - change title in the documentation
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      4a51075a
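
The commit above (#17901) is the original 8-bit integration: at load time, `nn.Linear` modules are swapped for bitsandbytes `Linear8bitLt` (keeping the lm head in higher precision), and a memory-footprint utility is added. A minimal usage sketch follows, with a placeholder checkpoint and the outlier threshold spelled as in the current `BitsAndBytesConfig` API rather than the kwargs this PR first introduced.

```python
# Hedged sketch: 8-bit loading via the Linear8bitLt replacement described above.
# Checkpoint name is a placeholder; llm_int8_threshold follows the current API.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",              # placeholder checkpoint
    device_map="auto",                   # int8 weights need to land on GPU
    quantization_config=BitsAndBytesConfig(
        load_in_8bit=True,               # replace nn.Linear with Linear8bitLt on load
        llm_int8_threshold=6.0,          # outlier threshold from LLM.int8()
    ),
)

# Utility added alongside this integration: model size in bytes after quantization.
print(model.get_memory_footprint())
```

As the commit message notes, weights are force-cast to float16 when `load_in_8bit` is enabled, and a warning is emitted if the quantized model ends up on CPU.
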