- 16 Jul, 2024 1 commit
-
-
Zach Mueller authored
* 1,100%! * Clean * Don't touch DS * Experiment with dtype allocation * skip test_load_save_without_tied_weights test * A little faster * Include proper upscaling? * Fixup tests * Potentially skip? * Let's see if this fixes git history * Maintain new dtype * Fin * Rm hook idea for now * New approach, see what breaks * stage * Clean * Stash * Should be fin now, just need to mark failing models * Clean up * Simplify * Deal with weird models * Enc/Dec * Skip w/ reason * Adjust test * Fix test * one more test * Keep experimenting * Fix ref * TO REMOVE: testing feedback CI * Right push * Update tests/utils/test_modeling_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * disable * Add new func * Test nits from Amy * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Adjust comment * Adjust comment on skip * make private * Fin * Should be a not flag * Clarify and rename test --------- Co-authored-by:
Marc Sun <marc@huggingface.co> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 11 Jul, 2024 1 commit
-
-
Omar Salman authored
* Add warning message for and parameters * Fix when the warning is raised * Formatting changes * Improve testing and remove duplicated warning from _fix_key
-
- 09 Jul, 2024 3 commits
-
-
Mauricio Villegas authored
Update modeling_utils.py Add return type annotation to PreTrainedModel.from_pretrained
-
kallewoof authored
-
Raushan Turganbay authored
* deprrecate `vocab_size` in other two VLMs * Update src/transformers/models/fuyu/configuration_fuyu.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * depracate until 4.44 --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 05 Jul, 2024 1 commit
-
-
Marc Sun authored
* Fix serialization * style * add test
-
- 27 Jun, 2024 1 commit
-
-
Billy Cao authored
* Allow dtype str for torch_dtype in from_pretrained * Update docstring * Add tests for str torch_dtype
-
- 19 Jun, 2024 1 commit
-
-
Joao Gante authored
-
- 12 Jun, 2024 2 commits
- 07 Jun, 2024 1 commit
-
-
Benjamin Badger authored
* added hidden subset * debugged hidden subset contrastive search * added contrastive search compression * debugged compressed contrastive search * memory reduction for contrastive search * debugged mem red * added low memory option feature * debugged mem optmimization output stack * debugged mem optmimization output stack * debugged low mem * added low mem cache * fixed 2047 tensor view * debugged 2042 past key val inputs * reformatted tensors * changed low mem output * final clean * removed subset hidden csearch * fixed hidden device * fixed hidden device * changed compressor dtype * removed hstate compression * integrated csearch in generate * test csearch integration into generation exit() * fixed csearch kwarg integration with generation * final wrap and added doc * Update src/transformers/generation/utils.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/utils.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/utils.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * added debug print * direct hstate cat * direct hstate cat * direct hstate cat debug * direct hstate cat debug * expanded full hidden state stack * expanded full hidden state stack * matched dims for hstates * matched dims for hstates * logits fix * equality test * equality hidden debug * debug * added prints for debug * added prints for debug * equality check * switched squeeze dim * input format debug * tracing top_k_ids * removed trace * added test context * added jitter * added jitter * added jitter * returned state * rebuilt past key value reconstruction * debugged * cleaned traces * added selection for pkv * changed output to dict * cleaned * cleaned * cleaned up contrastive search test * moved low_memory kwarg * debugged * changed low mem test batch size to 1 * removed output * debugged test input shape * reformatted csearch test * added trace * removed unsqueeze on final forward pass * replaced unsqueeze with view * removed traces * cleaned * debugged model kwargs * removed special models from test * ran make quality * Update src/transformers/generation/configuration_utils.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * refactored * refactored * refactored * make fixup * renamed flag sequential * renamed flag sequential * iterative onloading * black style and test utils * added traces for integrated test * debugged * added traces * make style * removed traces, make style * included suggestions and added test * debugged test * added offload module check and make style * is_accelerate_available and make style * added test decorator * changed test model and config spec * added offload condition * added lazy loading for each shard * debugged * modified sharding * debugged * added traces * removed safe serialization * no index overload; * trace on safe save ptrs * added ptr condition * debugged * debugged ptr * moved module map init * remake shard only for offloaded modules * refactored * debugged * refactored * debugged * cleaned and make style * cleaned and make style * added trace * sparse module map * debugged * removed module map conditional * refactored * debug * debugged * added traces * added shard mem trace * added shard mem trace * removed underlying storage check * refactored * memory leak removal and make style * cleaned * swapped test decs and make style * added mem checks and make style * added free mem warning * implemented some suggestions * moved onloading to accelerate * refactored for accelerate integration * cleaned test * make style * debugged offload map name * cleaned and make style * replaced meta device check for sharding * cleaned and make style * implemented some suggestions * more suggestions * update warning Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * more suggestions * make style * new make style * Update src/transformers/modeling_utils.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 29 May, 2024 1 commit
-
-
Lucain authored
* Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs
-
- 28 May, 2024 2 commits
-
-
Younes Belkada authored
add accelerate
-
oOraph authored
* Unit test to verify fix Signed-off-by:
Raphael Glon <oOraph@users.noreply.github.com> * fix from_pretrained in offline mode when model is preloaded in cache Signed-off-by:
Raphael Glon <oOraph@users.noreply.github.com> * minor: fmt Signed-off-by:
Raphael Glon <oOraph@users.noreply.github.com> --------- Signed-off-by:
Raphael Glon <oOraph@users.noreply.github.com> Co-authored-by:
Raphael Glon <oOraph@users.noreply.github.com>
-
- 24 May, 2024 1 commit
-
-
Lucain authored
-
- 23 May, 2024 1 commit
-
-
Raushan Turganbay authored
* clean-up * Update src/transformers/cache_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/cache_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/cache_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * more suggestions * mapping if torch available * run tests & add 'support_quantized' flag * fix jamba test * revert, will be fixed by another PR * codestyle * HQQ and versatile cache classes * final update * typo * make tests happy --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
- 16 May, 2024 2 commits
-
-
Mohit Sharma authored
* disable fa * disable fa * update warning * update warning
-
Joao Gante authored
* jamba cache * new flag * generate exception
-
- 15 May, 2024 2 commits
-
-
Younes Belkada authored
* add method * change method name * more comments * Apply suggestions from code review Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixup * add docstrings and fix comment * warn users on the de-quantized dtype * Update src/transformers/quantizers/base.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/bitsandbytes.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final suggestion - use private method --------- Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Adds support for loading GGUF files Co-authored-by:
Younes Belkada <younesbelkada@gmail.com> Co-authored-by:
99991 <99991@users.noreply.github.com> * add q2_k q3_k q5_k support from @99991 * fix tests * Update doc * Style * Docs * fix CI * Update docs/source/en/gguf.md * Update docs/source/en/gguf.md * Compute merges * change logic * add comment for clarity * add comment for clarity * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change logic * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/modeling_gguf_pytorch_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * put back comment * add comment about mistral * comments and added tests * fix unconsistent type * more * fix tokenizer * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * address comments about tests and tokenizer + add added_tokens * from_gguf -> gguf_file * replace on docs too --------- Co-authored-by:
Younes Belkada <younesbelkada@gmail.com> Co-authored-by:
99991 <99991@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 07 May, 2024 1 commit
-
-
David Xue authored
* add safetensors to model not found error for default use_safetensors=None case * format code w/ ruff * fix assert true typo
-
- 06 May, 2024 1 commit
-
-
Lucain authored
* Deprecate resume_download * remove default resume_download value --------- Co-authored-by:Lysandre Debut <hi@lysand.re>
-
- 02 May, 2024 2 commits
-
-
mobicham authored
* update HQQ transformers integration * push import_utils.py * add force_hooks check in modeling_utils.py * fix | with Optional * force bias as param * check bias is Tensor * force forward for multi-gpu * review fixes pass * remove torch grad() * if any key in linear_tags fix * add cpu/disk check * isinstance return * add multigpu test + refactor tests * clean hqq_utils imports in hqq.py * clean hqq_utils imports in quantizer_hqq.py * delete hqq_utils.py * Delete src/transformers/utils/hqq_utils.py * ruff init * remove torch.float16 from __init__ in test * refactor test * isinstance -> type in quantizer_hqq.py * cpu/disk device_map check in quantizer_hqq.py * remove type(module) nn.linear check in quantizer_hqq.py * add BaseQuantizeConfig import inside HqqConfig init * remove hqq import in hqq.py * remove accelerate import from test_hqq.py * quant config.py doc update * add hqqconfig to main_classes doc * make style * __init__ fix * ruff __init__ * skip_modules list * hqqconfig format fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * test_hqq.py remove mistral comment * remove self.using_multi_gpu is False * torch_dtype default val set and logger.info * hqq.py isinstance fix * remove torch=None * torch_device test_hqq * rename test_hqq * MODEL_ID in test_hqq * quantizer_hqq setattr fix * quantizer_hqq typo fix * imports quantizer_hqq.py * isinstance quantizer_hqq * hqq_layer.bias reformat quantizer_hqq * Step 2 as comment in quantizer_hqq * prepare_for_hqq_linear() comment * keep_in_fp32_modules fix * HqqHfQuantizer reformat * quantization.md hqqconfig * quantization.md model example reformat * quantization.md # space * quantization.md space }) * quantization.md space }) * quantization_config fix doc Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * axis value check in quantization_config * format * dynamic config explanation * quant config method in quantization.md * remove shard-level progress * .cuda fix modeling_utils * test_hqq fixes * make fix-copies --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Raushan Turganbay authored
* failiing CI * no let's keep it intil full deprecation in v4.42
-
- 26 Apr, 2024 1 commit
-
-
Michael Goin authored
* Update modeling_utils/dtype_byte_size to handle float8 types * Add a test for dtype_byte_size * Format * Fix bool
-
- 25 Apr, 2024 1 commit
-
-
Younes Belkada authored
ensure popular quant methods are supported
-
- 23 Apr, 2024 1 commit
-
-
Wing Lian authored
* fix for itemsize => element_size() for torch backwards compat * improve handling of element counting * Update src/transformers/modeling_utils.py * fixup * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Younes Belkada <younesbelkada@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 19 Apr, 2024 2 commits
-
-
hoshi-hiyouga authored
* Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py
-
Marc Sun authored
* Use unwrap with the one in accelerate * oups * update unwrap * fix * wording * raise error instead * comment * doc * Update src/transformers/modeling_utils.py Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * style * put else --------- Co-authored-by:
Zach Mueller <muellerzr@gmail.com>
-
- 12 Apr, 2024 1 commit
-
-
Sai-Suraj-27 authored
* Fixed deprecated logger.warn by using logger.warning * Reformatted using ruff.
-
- 10 Apr, 2024 1 commit
-
-
Younes Belkada authored
* fix torch compatiblity issues * fix * Update src/transformers/modeling_utils.py
-
- 09 Apr, 2024 1 commit
-
-
Sourab Mangrulkar authored
* fix sequence length errors * fix label column name error for vit * fix the lm_head embedding!=linear layer mismatches for Seq2Seq models
-
- 02 Apr, 2024 1 commit
-
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484) * [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add small tests. * Dead variable. * Fixup. * Fixing tied_Weights_keys on generic models. * Fixup + T5 encoder/decoder tying (with different layers) * Code quality. * Dynamic member. * trigger * Fixing encoder name for other types of encoder/decoder combos. * Fix scoping. * Update .github/workflows/self-scheduled.yml Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fixing the tied_weights after the call. --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
- 27 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* Automatic safetensors conversion when lacking these files (#29390) * Automatic safetensors conversion when lacking these files * Remove debug * Thread name * Typo * Ensure that raises do not affect the main thread * Catch all errors
-
- 25 Mar, 2024 2 commits
-
-
Arthur Zucker authored
-
Arthur Zucker authored
-
- 18 Mar, 2024 1 commit
-
-
Younes Belkada authored
* make `unexpected_keys` optional * push * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 15 Mar, 2024 1 commit
-
-
Marc Sun authored
* start integration * fix * add and debug tests * update tests * make pytorch serialization works * compatible with device_map and offload * fix tests * make style * add ref * guard against safetensors * add float8 and style * fix is_serializable * Fix shard_checkpoint compatibility with quanto * more tests * docs * adjust memory * better * style * pass tests * Update src/transformers/modeling_utils.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add is_safe_serialization instead * Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add QbitsTensor tests * fix tests * simplify activation list * Update docs/source/en/quantization.md Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * better comment * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * find and fix edge case * Update docs/source/en/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * pass weights_only_kwarg instead * fix shard_checkpoint loading * simplify update_missing_keys * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * recursion to get all tensors * block serialization * skip serialization tests * fix * change by cuda:0 for now * fix regression * update device_map * fix doc * add noteboon * update torch_dtype * update doc * typo * typo * remove comm --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <younesbelkada@gmail.com>
-
- 13 Mar, 2024 2 commits
-
-
Sourab Mangrulkar authored
* fsdp+qlora related changes * fixes * Update quantization_config.py * support fsdp+qlora and dsz3+qlora * Update quantization_config.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * handle fsdp+qlora and dsz3+qlora correctly while model loading * fix param count * quality * fsdp related changes * fsdp changes only when using LoRA/QLoRA * add accelerate version check * refactor, update min accelerate version and add tests 1. Update minimum accelerate version to 0.26.0 2. Clean the trainer wrt accelerate version checks 3. FSDP refactor and test for fsdp config 4. use `itemsize` instead of `dtype2bytes` dict * fix test * Address comments Co-Authored-By:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix the conditional flag * fix conditional flag * address comments Co-Authored-By:
Zach Mueller <7831895+muellerzr@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Zach Mueller <7831895+muellerzr@users.noreply.github.com>
-
Jiewen Tan authored
* tmp * Remove debug step * Fix a typo * Move to is_torch_xla_available
-