- 30 Jan, 2024 11 commits
-
-
Matt authored
* Pin torch to <2.2.0 * Pin torchvision and torchaudio as well * Playing around with versions to see if this helps * twiddle something to restart the CI * twiddle it back * Try changing the natten version * make fixup * Revert "Try changing the natten version" This reverts commit de0d6592c35dc39ae8b5a616c27285db28262d06. * make fixup * fix fix fix * fix fix fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
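For context, pins like these land in the dependency table rather than in model code; a minimal sketch of how such an upper bound is typically expressed in a setup.py-style list (the exact version strings here are illustrative, not the actual diff):

```python
# Illustrative only: pinning torch below 2.2.0 and keeping torchvision /
# torchaudio in lockstep, in a setup.py-style dependency list.
_deps = [
    "torch<2.2.0",         # upper bound until the CI breakage is resolved
    "torchvision<0.17.0",  # torchvision releases track specific torch versions
    "torchaudio<2.2.0",    # same for torchaudio
]
```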
-
Matt authored
* Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports
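The core of this port is choosing between the Keras 3 package and the backwards-compatible tf_keras package at import time; a minimal sketch of that pattern (error message wording is illustrative):

```python
# Prefer the tf_keras compatibility package; fall back to keras only if it is
# still a 2.x release, since Keras 3 breaks the TF modeling code.
try:
    import tf_keras as keras
except (ModuleNotFoundError, ImportError):
    import keras

    if int(keras.__version__.split(".")[0]) > 2:
        raise ValueError(
            "Keras 3 is installed, but this code needs the backwards-compatible "
            "tf-keras package. Install it with `pip install tf-keras`."
        )
```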
-
amyeroberts authored
* Abstract out pipeline init args * Address PR comments * Reword * BC PIPELINE_INIT_ARGS * Remove old arguments * Small fix
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates
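A hypothetical usage sketch of the options this commit touches (argument names follow the commit message; the backbone choice is illustrative):

```python
# Instantiate a detection model whose backbone is loaded with pretrained
# weights, sourced from timm rather than a transformers implementation.
from transformers import DetrConfig, DetrForObjectDetection

config = DetrConfig(
    backbone="resnet50",           # which backbone architecture to use
    use_pretrained_backbone=True,  # load pretrained backbone weights
    use_timm_backbone=True,        # resolve the backbone through timm
)
model = DetrForObjectDetection(config)
```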
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
fxmarty authored
guard sdpa on torch>=2.0
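A minimal sketch of such a guard (the helper name is illustrative): `torch.nn.functional.scaled_dot_product_attention` only exists from torch 2.0, so any SDPA code path has to check the installed version first.

```python
from packaging import version

import torch


def is_sdpa_available() -> bool:
    # scaled_dot_product_attention was introduced in torch 2.0
    return version.parse(torch.__version__) >= version.parse("2.0.0")
```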
-
Thien Tran authored
* use conv for tdnn * run make fixup * update TDNN * add PEFT LoRA check * propagate tdnn warnings to others * add missing imports * update TDNN in wav2vec2_bert * add missing imports
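The underlying idea is that a TDNN layer over fixed time offsets is mathematically a 1D convolution, so an unfold-plus-linear formulation can be replaced by `nn.Conv1d`; a runnable sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

# (batch, time, features) activations, as in a speaker-verification head
hidden_states = torch.randn(2, 100, 512)

# A TDNN layer with context width 5 expressed as a plain 1D convolution
tdnn = nn.Conv1d(in_channels=512, out_channels=512, kernel_size=5, dilation=1)

# Conv1d expects (batch, channels, time), so transpose in and out
out = tdnn(hidden_states.transpose(1, 2)).transpose(1, 2)
print(out.shape)  # torch.Size([2, 96, 512])
```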
-
Younes Belkada authored
Update _toctree.yml
-
Poedator authored
* squashed earlier commits for easier rebase * rm rebase leftovers * 4bit save enabled @quantizers * TMP gptq test use exllama * fix AwqConfigTest::test_wrong_backend for A100 * quantizers AWQ fixes * _load_pretrained_model low_cpu_mem_usage branch * quantizers style * remove require_low_cpu_mem_usage attr * rm dtype arg from process_model_before_weight_loading * rm config_origin from Q-config * rm inspect from q_config * fixed docstrings in QuantizationConfigParser * logger.warning fix * mv is_loaded_in_4(8)bit to BnbHFQuantizer * is_accelerate_available error msg fix in quantizer * split is_model_trainable in bnb quantizer class * rm llm_int8_skip_modules as separate var in Q * Q rm todo * fwd ref to HFQuantizer in type hint * rm note re optimum.gptq.GPTQQuantizer * quantization_config in __init__ simplified * replaced NonImplemented with create_quantized_param * rm load_in_4/8_bit deprecation warning * QuantizationConfigParser refactoring * awq-related minor changes * awq-related changes * awq config.modules_to_not_convert * raise error if no q-method in q-config in args * minor cleanup * awq quantizer docstring * combine common parts in bnb process_model_before_weight_loading * revert test_gptq * .process_model_ cleanup * restore dict config warning * removed typevars in quantizers.py * cleanup post-rebase 16 jan * QuantizationConfigParser classmethod refactor * rework of handling of unexpected aux elements of bnb weights * moved q-related stuff from save_pretrained to quantizers * refactor v1 * more changes * fix some tests * remove it from main init * ooops * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix awq issues * fix * fix * fix * fix * fix * fix * add docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/hf_quantizer.md * address comments * fix * fixup * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address final comment * update * Update src/transformers/quantizers/base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * add kwargs update * fixup * add `optimum_quantizer` attribute * oops * rm unneeded file * fix doctests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
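Despite the internal move to `HfQuantizer` classes, the user-facing entry point is unchanged; a usage sketch (requires bitsandbytes and a CUDA device at runtime):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# The quantization_config is now dispatched to a quantizer class internally
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quantization_config,
)
```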
-
Zhan Ling authored
Add _no_split_modules to CLIPModel
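For context, `_no_split_modules` tells accelerate's `device_map="auto"` dispatch which submodule classes must stay on a single device; with it defined, something like the following sketch becomes possible (checkpoint is illustrative):

```python
from transformers import CLIPModel

# Works once CLIPModel declares _no_split_modules, so accelerate knows it may
# shard the model across devices without cutting through the listed blocks.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32", device_map="auto")
```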
-
Omar Sanseviero authored
* Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
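A minimal sketch of the "protect from setting directly" pattern, assuming a property-based guard (class and attribute names are illustrative, not the actual diff):

```python
class QuantizationConfig:
    def __init__(self, load_in_4bit: bool = False):
        self._load_in_4bit = load_in_4bit

    @property
    def load_in_4bit(self) -> bool:
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value: bool):
        # Validate instead of silently accepting any direct assignment
        if not isinstance(value, bool):
            raise TypeError("load_in_4bit must be a boolean")
        self._load_in_4bit = value
```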
-
- 29 Jan, 2024 13 commits
-
-
ThibaultLengagne authored
* doc: french README Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add Depth Anything Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add french link in other docs Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add missing links in fr docs * doc: fix several mistakes in translation Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> --------- Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> Co-authored-by: Sarapuce <alexandreh@padok.fr>
-
Ajay Patel authored
* Update trainer.py * Revert "Update trainer.py" This reverts commit 0557e2cc9effa3a41304322032239a3874b948a7. * Make trainer.py use adapter_only=True when using FSDP + PEFT * Support load_best_model with adapter_only=True * Ruff format * Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it
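The compatibility trick in the last bullet can be sketched as follows (the wrapper name is hypothetical): only forward `adapter_only=True` to accelerate's FSDP save/load helpers when the installed version actually accepts it.

```python
import inspect


def call_with_optional_adapter_only(fn, *args, **kwargs):
    # Pass adapter_only=True only if the helper's signature supports it
    if "adapter_only" in inspect.signature(fn).parameters:
        kwargs["adapter_only"] = True
    return fn(*args, **kwargs)
```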
-
Sanchit Gandhi authored
* [Whisper] Make tokenizer normalization public * add to docs
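A usage sketch of the now-public API (method names follow the PR title; treat the exact behaviour as illustrative):

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-tiny")
# English normalizer: lowercasing, punctuation and number handling, etc.
normalized = tokenizer.normalize("Mr. Smith bought 2 apples.")
# Basic multilingual normalizer
basic = tokenizer.basic_normalize("Näive café text")
```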
-
xkszltl authored
-
amyeroberts authored
* Mark test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py
-
amyeroberts authored
* Pin pytest version <8.0.0 * Update setup.py * make deps_table_update
-
Julien Chaumond authored
-
Nate Cibik authored
* Enabled gradient checkpointing in Deformable DETR * Enabled gradient checkpointing in Deformable DETR encoder * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code
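With checkpointing wired in, the standard toggle applies; a usage sketch (checkpoint name is illustrative):

```python
from transformers import DeformableDetrForObjectDetection

model = DeformableDetrForObjectDetection.from_pretrained("SenseTime/deformable-detr")
# Recompute encoder activations in the backward pass to save memory
model.gradient_checkpointing_enable()
```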
-
Wesley Gifford authored
* 🐛 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Vinyzu authored
* [Docs] Fix Typo in English CLIP model_doc * [Docs] Fix Typo in Japanese CLIP model_doc
-
Klaus Hipp authored
-
Yih-Dar authored
* fix * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Angela Yi authored
* Add serialized type name to pytrees * Modify context * add serde test
-
- 28 Jan, 2024 1 commit
-
-
amyeroberts authored
[Siglip] protect from imports if sentencepiece not installed
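A minimal sketch of the guarded-import pattern used for optional dependencies like sentencepiece:

```python
from transformers.utils import is_sentencepiece_available

if is_sentencepiece_available():
    from transformers import SiglipTokenizer
else:
    # Importing transformers no longer fails; the tokenizer simply isn't exposed
    SiglipTokenizer = None
```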
-
- 27 Jan, 2024 2 commits
-
-
Joao Gante authored
-
Joao Gante authored
-
- 26 Jan, 2024 12 commits
-
-
Sanchit Gandhi authored
-
Steven Liu authored
* change datasets * fix
-
Yih-Dar authored
* try pydantic v2 * try pydantic v2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Scruel Tao authored
* fix: suppress `GatedRepoError` to use cache file (fix #28558). * move condition_to_return parameter back to outside.
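A sketch of the fallback behaviour this fix restores, written as a hypothetical wrapper (not the actual patched function): when the Hub refuses access to a gated repo, fall back to a previously downloaded cache entry instead of raising.

```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import GatedRepoError


def resolve_with_cache_fallback(repo_id: str, filename: str) -> str:
    try:
        return hf_hub_download(repo_id=repo_id, filename=filename)
    except GatedRepoError:
        # Re-resolve against the local cache only; still raises if nothing is cached
        return hf_hub_download(repo_id=repo_id, filename=filename, local_files_only=True)
```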
-
Matt authored
* Stop confusing the TF compiler with ModelOutput objects * Stop confusing the TF compiler with ModelOutput objects
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Shukant Pal authored
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS. enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value was always True. This change makes sure the user's preference is respected implicitly on initialization.
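A sketch of the initialization change (a module-level variable, as described above):

```python
import huggingface_hub.utils as hf_hub_utils

# Previously: _tqdm_active = True, which ignored HF_HUB_DISABLE_PROGRESS_BARS.
# Now the initial state mirrors the user's huggingface_hub preference.
_tqdm_active = not hf_hub_utils.are_progress_bars_disabled()
```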
-
D authored
* Update preprocessing.md adjust ImageProcessor link to working target (same as in lower section of file) * Update preprocessing.md
-
Turetskii Mikhail authored
-
Facico authored
* support PeftMixedModel signature inspect * import PeftMixedModel only peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
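A minimal sketch of the version-gated import in the second bullet: `PeftMixedModel` only exists from peft 0.7.0, so the import must be conditional.

```python
from packaging import version

import peft

if version.parse(peft.__version__) >= version.parse("0.7.0"):
    from peft import PeftMixedModel  # safe: the class exists in this version
```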
-
fxmarty authored
* fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order --------- Co-authored-by: Your Name <you@example.com>
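The deduplication relies on the `warning_once` helper on transformers' logger, which emits a given message only the first time it is seen; a usage sketch:

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

# Emitted once, no matter how many layers hit this code path
logger.warning_once("Falling back from Flash Attention; see docs for details.")
```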
-
Yih-Dar authored
* fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 25 Jan, 2024 1 commit
-
-
Peter Götz authored
The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but the accompanying diagram laid the model out horizontally. This change updates the diagram so the model is indeed visualized vertically.
-