- 26 Apr, 2024 1 commit
-
-
Michael Goin authored
* Update modeling_utils/dtype_byte_size to handle float8 types
* Add a test for dtype_byte_size
* Format
* Fix bool
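As a rough illustration of the change (a sketch, not the exact transformers implementation): a per-dtype byte-size helper has to special-case `torch.bool` and parse the bit width out of float8 names such as `torch.float8_e4m3fn`, whose suffix would otherwise confuse a naive parse.

```python
import re
import torch

def dtype_byte_size(dtype: torch.dtype) -> float:
    """Bytes per element of `dtype` (bools counted as one bit)."""
    if dtype == torch.bool:
        return 1 / 8
    # float8 dtypes are named e.g. "torch.float8_e4m3fn"; match the first digit
    # group so the "_e4m3fn" suffix does not break the parse.
    match = re.search(r"[^\d](\d+)(_.*)?$", str(dtype))
    if match is None:
        raise ValueError(f"`dtype` is not a valid dtype: {dtype}.")
    return int(match.groups()[0]) // 8

# dtype_byte_size(torch.float8_e4m3fn) -> 1, dtype_byte_size(torch.float16) -> 2
```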
-
- 25 Apr, 2024 1 commit
-
-
Younes Belkada authored
ensure popular quant methods are supported
-
- 23 Apr, 2024 1 commit
-
-
Wing Lian authored
* fix for itemsize => element_size() for torch backwards compat
* improve handling of element counting
* Update src/transformers/modeling_utils.py
* fixup
* Update src/transformers/modeling_utils.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
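For context, a minimal sketch of the compatibility shim the first bullet describes (the helper name is illustrative, not the exact code in modeling_utils.py): `torch.dtype.itemsize` only exists on recent PyTorch releases, so older versions need a fallback that asks a tensor for its element size.

```python
import torch

def bytes_per_element(dtype: torch.dtype) -> int:
    """Bytes occupied by one element of `dtype`, on old and new torch alike."""
    if hasattr(dtype, "itemsize"):  # available on recent PyTorch versions
        return dtype.itemsize
    # Fallback for older releases: a zero-dim tensor knows its element size.
    return torch.empty((), dtype=dtype).element_size()
```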
-
- 19 Apr, 2024 2 commits
-
-
hoshi-hiyouga authored
* Update modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
-
Marc Sun authored
* Use unwrap with the one in accelerate
* oups
* update unwrap
* fix
* wording
* raise error instead
* comment
* doc
* Update src/transformers/modeling_utils.py
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* style
* put else
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
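The first bullet refers to delegating model unwrapping to accelerate. A hedged sketch of that idea, assuming accelerate is installed (this is not claimed to be the exact transformers code):

```python
import torch.nn as nn
from accelerate.utils import extract_model_from_parallel

def unwrap_model(model: nn.Module) -> nn.Module:
    """Strip DDP/compile-style wrappers by reusing accelerate's helper."""
    return extract_model_from_parallel(model)
```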
-
- 12 Apr, 2024 1 commit
-
-
Sai-Suraj-27 authored
* Fixed deprecated logger.warn by using logger.warning
* Reformatted using ruff.
-
- 10 Apr, 2024 1 commit
-
-
Younes Belkada authored
* fix torch compatibility issues
* fix
* Update src/transformers/modeling_utils.py
-
- 09 Apr, 2024 1 commit
-
-
Sourab Mangrulkar authored
* fix sequence length errors
* fix label column name error for vit
* fix the lm_head embedding != linear layer mismatches for Seq2Seq models
-
- 02 Apr, 2024 1 commit
-
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors, getting rid of the issue
  - Find all identical names (those should be declared in the config, but we try to find them all anyway)
  - For all identical names:
    - If they are in the config, just ignore them; everything is fine
    - If they are not, warn about them
  - For all remaining tensors which are shared yet neither identical NOR disjoint, raise a hard error
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add small tests.
* Dead variable.
* Fixup.
* Fixing tied_weights_keys on generic models.
* Fixup + T5 encoder/decoder tying (with different layers)
* Code quality.
* Dynamic member.
* trigger
* Fixing encoder name for other types of encoder/decoder combos.
* Fix scoping.
* Update .github/workflows/self-scheduled.yml
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fixing the tied_weights after the call.
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
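To illustrate the detection step described above, here is a rough sketch under the assumption that shared tensors can be found by comparing storage pointers (the function name is illustrative; this is not the literal code from the PR):

```python
from collections import defaultdict
import torch

def find_shared_tensor_names(state_dict: dict) -> list:
    """Group parameter names that view the same underlying storage."""
    groups = defaultdict(set)
    for name, tensor in state_dict.items():
        if tensor.device.type != "meta" and tensor.numel() > 0:
            groups[tensor.untyped_storage().data_ptr()].add(name)
    return [names for names in groups.values() if len(names) > 1]

# Names grouped here must either be declared tied weights (and dropped before
# saving), cloned if their views are disjoint, or trigger a hard error.
```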
-
- 27 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* Automatic safetensors conversion when lacking these files (#29390)
* Automatic safetensors conversion when lacking these files
* Remove debug
* Thread name
* Typo
* Ensure that raises do not affect the main thread
* Catch all errors
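A small sketch of the pattern the last two bullets describe: run the opportunistic conversion in a separate thread and swallow any exception so it cannot reach the main thread. The thread name, logging, and function signature below are illustrative assumptions, not the exact transformers code.

```python
import threading

def spawn_conversion(convert_fn, *args, **kwargs) -> threading.Thread:
    """Run a best-effort conversion in the background; never let it raise."""
    def _wrapped():
        try:
            convert_fn(*args, **kwargs)
        except Exception as exc:  # catch all errors: conversion is optional
            print(f"safetensors auto-conversion skipped: {exc}")

    thread = threading.Thread(target=_wrapped, name="Thread-autoconversion", daemon=True)
    thread.start()
    return thread
```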
-
- 25 Mar, 2024 2 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
- 18 Mar, 2024 1 commit
-
-
Younes Belkada authored
* make `unexpected_keys` optional
* push
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 15 Mar, 2024 1 commit
-
-
Marc Sun authored
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* find and fix edge case
* Update docs/source/en/quantization.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change by cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* typo
* typo
* remove comm
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
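For readers wondering what this integration looks like from the user side, here is an illustrative load with quanto weight quantization (the model id is a placeholder and the exact defaults may differ):

```python
from transformers import AutoModelForCausalLM, QuantoConfig

quant_config = QuantoConfig(weights="int8")  # int4, int2 and float8 are other options
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=quant_config,
    device_map="cuda:0",
)
```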
-
- 13 Mar, 2024 2 commits
-
-
Sourab Mangrulkar authored
* fsdp+qlora related changes
* fixes
* Update quantization_config.py
* support fsdp+qlora and dsz3+qlora
* Update quantization_config.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* handle fsdp+qlora and dsz3+qlora correctly while model loading
* fix param count
* quality
* fsdp related changes
* fsdp changes only when using LoRA/QLoRA
* add accelerate version check
* refactor, update min accelerate version and add tests
  1. Update minimum accelerate version to 0.26.0
  2. Clean the trainer wrt accelerate version checks
  3. FSDP refactor and test for fsdp config
  4. use `itemsize` instead of `dtype2bytes` dict
* fix test
* Address comments
  Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* fix the conditional flag
* fix conditional flag
* address comments
  Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
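As a hedged illustration of the kind of setup these changes enable (the model id and dtypes below are placeholders, not taken from the PR): for FSDP or DeepSpeed-Zero3 with QLoRA, the packed 4-bit weights are kept in a regular floating-point storage dtype so the sharding machinery can handle them.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,  # storage dtype FSDP can shard
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
```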
-
Jiewen Tan authored
* tmp
* Remove debug step
* Fix a typo
* Move to is_torch_xla_available
-
- 11 Mar, 2024 2 commits
-
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
-
Yitong Huang authored
* add USE_TORCH_XLA env
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling
---------
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
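A minimal sketch of what an env-gated availability check like this can look like (simplified; the real `is_torch_xla_available` in transformers does more than shown here, and the accepted env values are assumptions):

```python
import importlib.util
import os

def is_torch_xla_available() -> bool:
    """Report torch_xla as usable only if USE_TORCH_XLA permits it and the package exists."""
    if os.environ.get("USE_TORCH_XLA", "1").upper() in ("0", "FALSE", "NO", "OFF"):
        return False
    return importlib.util.find_spec("torch_xla") is not None
```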
-
- 07 Mar, 2024 2 commits
-
-
Alex Ishida authored
Add support for loading safetensors files saved with the `mlx` metadata format.
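To make the idea concrete, a hedged sketch of inspecting a checkpoint's metadata (the file name is a placeholder): safetensors files written by MLX carry `format: mlx` in their header, where PyTorch-saved files carry `format: pt`.

```python
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    metadata = f.metadata()  # e.g. {"format": "mlx"} or {"format": "pt"}

if metadata is None or metadata.get("format") not in ("pt", "tf", "flax", "mlx"):
    raise OSError("Checkpoint was not saved by a recognized framework.")
```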
-
Lysandre Debut authored
Revert "Automatic safetensors conversion when lacking these files (#29390)" This reverts commit a69cbf4e.
-
- 06 Mar, 2024 1 commit
-
-
Fanli Lin authored
* use require_torch_gpu
* enable on XPU
* fix
-
- 05 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* Automatic safetensors conversion when lacking these files
* Remove debug
* Thread name
* Typo
* Ensure that raises do not affect the main thread
-
- 01 Mar, 2024 1 commit
-
-
Song Fuchang authored
Expose `offload_buffers` parameter of `accelerate` to `PreTrainedModel.from_pretrained` method (#28755)
Expose offload_buffers parameter to from_pretrained method
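An illustrative call showing the exposed parameter (the model id and offload folder are placeholders): when weights are dispatched with a device_map and some end up on CPU or disk, `offload_buffers` asks accelerate to offload non-parameter buffers alongside them.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # placeholder model id
    device_map="auto",
    offload_folder="offload",
    offload_buffers=True,         # forwarded to accelerate's dispatch/offload logic
)
```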
-
- 27 Feb, 2024 1 commit
-
-
fxmarty authored
fix
-
- 20 Feb, 2024 1 commit
-
-
Arthur authored
* default to use it
* style
-
- 16 Feb, 2024 1 commit
-
-
Lysandre Debut authored
* Script & Manual edition
* Update
-
- 15 Feb, 2024 1 commit
-
-
Younes Belkada authored
Update modeling_utils.py
-
- 14 Feb, 2024 1 commit
-
-
Younes Belkada authored
* enhance trainer + not support quant methods
* remove all old logic
* add version
-
- 12 Feb, 2024 1 commit
-
-
JB (Don) authored
Continue to initialize tied output_embeddings if it has a bias term. The bias term is not tied, and so will need to be initialized accordingly.
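A toy sketch of the situation being described (dimensions and module names are illustrative): weight tying shares only the weight matrix, so a bias on the output projection still needs its own initialization.

```python
import torch.nn as nn

vocab_size, hidden_size = 100, 16
input_embeddings = nn.Embedding(vocab_size, hidden_size)
output_embeddings = nn.Linear(hidden_size, vocab_size, bias=True)

output_embeddings.weight = input_embeddings.weight  # tied: both modules share one tensor
nn.init.zeros_(output_embeddings.bias)              # bias is not tied and must be initialized
```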
-
- 06 Feb, 2024 1 commit
-
- 05 Feb, 2024 1 commit
-
-
Nicolas Patry authored
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors, getting rid of the issue
  - Find all identical names (those should be declared in the config, but we try to find them all anyway)
  - For all identical names:
    - If they are in the config, just ignore them; everything is fine
    - If they are not, warn about them
  - For all remaining tensors which are shared yet neither identical NOR disjoint, raise a hard error
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 02 Feb, 2024 2 commits
-
-
Juri Ganitkevitch authored
* Add missing None check for hf_quantizer
* Add test, fix logic.
* make style
* Switch test model to Mistral
* Comment
* Update tests/test_modeling_utils.py
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Klaus Hipp authored
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
-
- 31 Jan, 2024 1 commit
-
-
tom-p-reichel authored
* test that tied output embeddings aren't initialized on load
* don't initialize the output embeddings if we're going to tie them to the input embeddings
-
- 30 Jan, 2024 1 commit
-
-
Poedator authored
* squashed earlier commits for easier rebase
* rm rebase leftovers
* 4bit save enabled @quantizers
* TMP gptq test use exllama
* fix AwqConfigTest::test_wrong_backend for A100
* quantizers AWQ fixes
* _load_pretrained_model low_cpu_mem_usage branch
* quantizers style
* remove require_low_cpu_mem_usage attr
* rm dtype arg from process_model_before_weight_loading
* rm config_origin from Q-config
* rm inspect from q_config
* fixed docstrings in QuantizationConfigParser
* logger.warning fix
* mv is_loaded_in_4(8)bit to BnbHFQuantizer
* is_accelerate_available error msg fix in quantizer
* split is_model_trainable in bnb quantizer class
* rm llm_int8_skip_modules as separate var in Q
* Q rm todo
* fwd ref to HFQuantizer in type hint
* rm note re optimum.gptq.GPTQQuantizer
* quantization_config in __init__ simplified
* replaced NonImplemented with create_quantized_param
* rm load_in_4/8_bit deprecation warning
* QuantizationConfigParser refactoring
* awq-related minor changes
* awq-related changes
* awq config.modules_to_not_convert
* raise error if no q-method in q-config in args
* minor cleanup
* awq quantizer docstring
* combine common parts in bnb process_model_before_weight_loading
* revert test_gptq
* .process_model_ cleanup
* restore dict config warning
* removed typevars in quantizers.py
* cleanup post-rebase 16 jan
* QuantizationConfigParser classmethod refactor
* rework of handling of unexpected aux elements of bnb weights
* moved q-related stuff from save_pretrained to quantizers
* refactor v1
* more changes
* fix some tests
* remove it from main init
* ooops
* Apply suggestions from code review
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix awq issues
* fix
* fix
* fix
* fix
* fix
* fix
* add docs
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/hf_quantizer.md
* address comments
* fix
* fixup
* Update src/transformers/modeling_utils.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address final comment
* update
* Update src/transformers/quantizers/base.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* add kwargs update
* fixup
* add `optimum_quantizer` attribute
* oops
* rm unneeded file
* fix doctests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
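Since the list above mentions that 4-bit saving is enabled through the new quantizers, here is a hedged end-to-end illustration (the model id and output path are placeholders, not taken from the PR, and saving 4-bit weights also depends on a recent bitsandbytes release):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m", quantization_config=config)
# With the quantizer refactor, serializing the quantized weights goes through the quantizer:
model.save_pretrained("opt-350m-4bit")
```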
-
- 26 Jan, 2024 4 commits
-
-
Scruel Tao authored
* fix: suppress `GatedRepoError` to use cache file (fix #28558).
* move condition_to_return parameter back to outside.
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Turetskii Mikhail authored
-
fxmarty authored
* fix duplicate & unnecessary flash warnings
* trigger ci
* warning_once
* if/else order
---------
Co-authored-by: Your Name <you@example.com>
-
- 18 Jan, 2024 1 commit
-
-
Yih-Dar authored
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-