- 19 Feb, 2024 (2 commits)

Titus authored
* generated text on A10G
* generated text in CI
* Apply suggestions from code review: add explanatory comments
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

Younes Belkada authored
Update test_mixed_int8.py
- 16 Feb, 2024 (1 commit)

Lysandre Debut authored
* Script & manual edition
* Update
- 12 Feb, 2024 (1 commit)

Klaus Hipp authored
Add language identifiers to code blocks
- 30 Jan, 2024 (2 commits)

Poedator authored
* squashed earlier commits for easier rebase
* rm rebase leftovers
* 4bit save enabled @quantizers
* TMP gptq test use exllama
* fix AwqConfigTest::test_wrong_backend for A100
* quantizers AWQ fixes
* _load_pretrained_model low_cpu_mem_usage branch
* quantizers style
* remove require_low_cpu_mem_usage attr
* rm dtype arg from process_model_before_weight_loading
* rm config_origin from Q-config
* rm inspect from q_config
* fixed docstrings in QuantizationConfigParser
* logger.warning fix
* mv is_loaded_in_4(8)bit to BnbHFQuantizer
* is_accelerate_available error msg fix in quantizer
* split is_model_trainable in bnb quantizer class
* rm llm_int8_skip_modules as separate var in Q
* Q rm todo
* fwd ref to HFQuantizer in type hint
* rm note re optimum.gptq.GPTQQuantizer
* quantization_config in __init__ simplified
* replaced NonImplemented with create_quantized_param
* rm load_in_4/8_bit deprecation warning
* QuantizationConfigParser refactoring
* awq-related minor changes
* awq-related changes
* awq config.modules_to_not_convert
* raise error if no q-method in q-config in args
* minor cleanup
* awq quantizer docstring
* combine common parts in bnb process_model_before_weight_loading
* revert test_gptq
* .process_model_ cleanup
* restore dict config warning
* removed typevars in quantizers.py
* cleanup post-rebase 16 jan
* QuantizationConfigParser classmethod refactor
* rework of handling of unexpected aux elements of bnb weights
* moved q-related stuff from save_pretrained to quantizers
* refactor v1
* more changes
* fix some tests
* remove it from main init
* ooops
* Apply suggestions from code review
* fix awq issues
* fix (several follow-up fixes)
* add docs
* Apply suggestions from code review
* Apply suggestions from code review
* Update docs/source/en/hf_quantizer.md
* address comments
* fix
* fixup
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* address final comment
* update
* Update src/transformers/quantizers/base.py
* Update src/transformers/quantizers/auto.py
* fix
* add kwargs update
* fixup
* add `optimum_quantizer` attribute
* oops
* rm unneeded file
* fix doctests
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
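For context, this refactor routes quantized model loading through `HfQuantizer` classes selected from the `quantization_config` passed to `from_pretrained`. A minimal sketch of the user-facing path it covers (the checkpoint name and settings below are illustrative, not taken from the commit):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit load; the quantizer resolved from this config now owns the
# pre-/post-weight-loading steps that previously lived in modeling_utils.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
```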

Omar Sanseviero authored
* Update quantization_config.py
* Style
* Protect from setting directly
* add tests
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
- 21 Dec, 2023 (1 commit)

Poedator authored
* updated bitsandbytes.py
* rm test_raise_* from test_4bit.py
* add test_4bit_serialization.py
* modeling_utils bulk edits
* bnb_ver 0.41.3 in integrations/bitsandbytes.py
* @slow reinstated
* bnb ver 0.41.3 in src/transformers/modeling_utils.py
* rm bnb version todo in integrations/bitsandbytes.py
* moved 4b serialization tests to test_4bit
* tests upd for opt
* to torch_device
* ruff fixes to tests
* rm redundant bnb version check in mod_utils
* restore _hf_peft_config_loaded in modeling_utils.py::2188
* restore _hf_peft_config_loaded test in modeling_utils.py::2199
* fixed NOT getattr(self, "is_8bit_serializable")
* setting model.is_4bit_serializable
* rm separate fp16_statistics arg from set_module...
* rm else branch in integrations::bnb::set_module
* bnb 4bit dtype check
* upd comment on 4bit weights
* upd tests for FP4 safe
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
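For context, this change enables serializing 4-bit bitsandbytes models and bumps the required bitsandbytes version to 0.41.3. A rough, hedged sketch of the round trip it unlocks; checkpoint name and output directory are placeholders:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load a model in 4-bit, then persist the quantized weights (bitsandbytes >= 0.41.3).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder checkpoint
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model.save_pretrained("./opt-350m-4bit")  # placeholder output directory

# The serialized 4-bit checkpoint can be reloaded directly.
reloaded = AutoModelForCausalLM.from_pretrained("./opt-350m-4bit", device_map="auto")
```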

- 02 Nov, 2023 (1 commit)

Younes Belkada authored
* fix for 8bit serialization
* added regression tests
* fixup
- 31 Oct, 2023 (1 commit)

Younes Belkada authored
fix bnb mpt test
- 30 Oct, 2023 (1 commit)

Younes Belkada authored
* fix bnb test
* link to GH issue
- 16 Oct, 2023 (1 commit)

Younes Belkada authored
* First step
* fix
* add adjustments for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
* fix serialization
* Apply suggestions from code review
* fixup
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
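For context, `_pre_quantization_dtype` records the dtype the weights had before quantization so serialization and later dtype handling can recover it. A small, hedged illustration (the attribute is internal and the checkpoint name is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
    device_map="auto",
)

# Internal bookkeeping set during quantized loading; absent on unquantized models.
print(getattr(model.config, "_pre_quantization_dtype", None))  # e.g. torch.float16
```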

- 03 Oct, 2023 (1 commit)

Younes Belkada authored
* fix issues with PEFT
* logger warning FutureWarning issues
* fixup
* adapt from suggestions
* oops
* rm test
- 02 Oct, 2023 (1 commit)

Younes Belkada authored
* fix bnb test with code revision
* fix test
* Apply suggestions from code review
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
- 13 Sep, 2023 (2 commits)

Younes Belkada authored
* Final fix RWKV 4bit
* fixup
* add a test
* add more clarifications

Younes Belkada authored
* fix 4bit `num_parameters`
* stronger check
- 25 Aug, 2023 (1 commit)

Younes Belkada authored
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* address feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
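For context, this refactor moves third-party glue such as DeepSpeed and bitsandbytes under `transformers.integrations`, keeping the old module paths as shims that emit a `FutureWarning`. A hedged sketch of the import change it implies:

```python
# New location after the refactor: integrations live under `transformers.integrations`.
from transformers.integrations import HfDeepSpeedConfig  # noqa: F401

# The previous path (`transformers.deepspeed`) is kept for backwards compatibility
# but emits a FutureWarning, per the commit notes ("add deprecation warning").
```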

- 17 Aug, 2023 (1 commit)

Younes Belkada authored
fix un-rendered images
- 10 Aug, 2023 (1 commit)

Marc Sun authored
* GPTQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
* Apply suggestions from code review
* modify dataclass
* add gptqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
* fix docstring formatting
* add doc
* better test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
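For context, this integration exposes GPTQ through `GPTQConfig` and the standard `from_pretrained` entry point. A minimal, hedged sketch of the API it adds (model name and calibration dataset are illustrative; on-the-fly quantization needs the optimum and auto-gptq backends installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize on the fly with a calibration dataset (requires optimum + auto-gptq).
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
quantized = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)

# An already-quantized GPTQ checkpoint loads through the same call, e.g.:
# AutoModelForCausalLM.from_pretrained("<gptq-checkpoint>", device_map="auto")
```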