- 26 Apr, 2024 8 commits
-
-
Sanchit Gandhi authored
* [examples] update whisper fine-tuning * deprecate forced/suppress tokens * item assignment * update readme * final fix
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Clarify pretrained import * Use load_backbone instead * Add backbone_kwargs to config * Fix up * Add tests * Tidy up * Enable instantiating model with pretrained backbone weights * Update tests so backbone checkpoint isn't passed in * Clarify pretrained import * Update configs - docs and validation check * Update src/transformers/utils/backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Clarify exception message * Update config init in tests * Add test for when use_timm_backbone=True * Use load_backbone instead * Add use_timm_backbone to the model configs * Add backbone_kwargs to config * Pass kwargs to constructors * Draft * Fix tests * Add back timm - weight naming * More tidying up * Whoops * Tidy up * Handle when kwargs are none * Update tests * Revert test changes * Deformable detr test - don't use default * Don't mutate; correct model attributes * Add some clarifying comments * nit - grammar is hard --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Zach Mueller authored
* Remove skipping logic now that set_epoch exists * Working version, clean
-
JB (Don) authored
* Adding SDPA support for BERT * Using the proper input name for testing model input in inference() * Adding documentation for SDPA in BERT model page * Use the stable link for the documentation * Adding a gate to only call .contiguous() for torch < 2.2.0 * Additions and fixes to the documentation * Minor updates to documentation * Adding extra requirements needed for the contiguous() bug * Adding "Adapted from" in place of the "Copied from" * Add benchmark speedup tables to the documentation * Minor fixes to the documentation * Use ClapText as a replacement for Bert in the Copied-From * Some more fixes for the fix-copies references * Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage [test all] * Undo changes to separate test * Refactored SDPA self attention code for KV projections * Change use_sdpa to attn_implementation * Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
-
Matt authored
Use the Keras set_random_seed to ensure reproducible weight initialization
-
Michael Goin authored
* Update modeling_utils/dtype_byte_size to handle float8 types * Add a test for dtype_byte_size * Format * Fix bool
-
kyo authored
Fix the `bitsandbytes` error when some modules are not properly offloaded.
-
Younes Belkada authored
Update quantizer_eetq.py
-
- 25 Apr, 2024 18 commits
-
-
Aaron Jimenez authored
* add pipeline_webserver to es/ * add pipeline_webserver to es/, translate first section * add comment for checking link * translate pipeline_webserver * edit pipeline_webserver * fix typo
-
Younes Belkada authored
ensure popular quant methods are supported
-
Matt authored
* Draft tutorial for talking to chat models * Reformat lists and text snippets * Cleanups and clarifications * Finish up remaining TODOs * Correct section link * Small fix * Add proper quantization examples * Add proper quantization examples * Add proper quantization examples * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/conversations.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix Text Generation Pipeline link and add a ref to the LLM inference guide * intelligent -> capable * Small intro cleanup * Small text cleanup * Small text cleanup * Clarification about system message * Clarification about system message --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Xuehai Pan authored
-
Raushan Turganbay authored
-
Zach Mueller authored
* Introduce saveable callbacks * Add note * Test for non-present and flag * Support early stopping and refusing to train further * Update docstring * More saving * Import oopsie * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make it go through TrainerArguments * Document * Fix test * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Rework to allow for duplicates * Clean * Fix failing tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Zach Mueller authored
* Pin accelerate w/o eager * Eager * Update .circleci/create_circleci_config.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Expound * Expound squared * PyTorch -> dependency --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
manju rangam authored
* Fix issue #29817 Video Classification Task Guide Using Undeclared Variables * Update docs/source/en/tasks/video_classification.md updated with review comments Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix issue #29817 Add line space following PR comments --------- Co-authored-by: manju-rangam <Manju1@Git> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Alexander Visheratin authored
* Added WSD scheduler. * Added tests. * Fixed errors. * Fix formatting. * CI fixes.
-
Yoach Lacombe authored
* first modeling code * make repository * still WIP * update model * add tests * add latest change * clean docstrings and copied from * update docstrings md and readme * correct chroma function * correct copied from and remove unrelated test * add doc to toctree * correct imports * add convert script to notdoctested * Add suggestion from Sanchit Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct get_unconditional_inputs docstrings * modify README according to SANCHIT feedback * add chroma to audio utils * clean librosa and torchaudio hard dependencies * fix FE * refactor audio decoder -> audio encoder for consistency with previous musicgen * refactor conditional -> encoder * modify sampling rate logics * modify license at the beginning * refactor all_self_attns->all_attentions * remove ignore copy from causallm generate * add copied from for from_sub_models * fix make copies * add warning if audio is truncated * add copied from where relevant * remove artefact * fix convert script * fix torchaudio and FE * modify chroma method according to feedback-> better naming * refactor input_values->input_features * refactor input_values->input_features and fix import fe * add input_features to docstrings * correct inputs_embeds logics * remove dtype conversion * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation * change warning for chroma length * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change way to save wav, using soundfile * correct docs and change to soundfile * fix import * fix init proj layers * add draft training * fix cross entropy * clean loss computation * fix labels * remove line breaks from md * fix issue with docstrings * add FE suggestions * improve is in logics and remove useless imports * remove custom from_pretrained * simplify docstring code * add suggestions for modeling tests * make style * update converting script with sanity check * remove encoder attention mask from conditional generation * replace musicgen melody checkpoints with official orga * rename ylacombe->facebook in checkpoints * fix copies * remove unnecessary warning * add shape in code docstrings * add files to slow doc tests * fix md bug and add md to not_tested * make fix-copies * fix hidden states test and batching * update training code * add training tests for melody * add training for o.g musicgen * fix copied from * remove final todos * make style * fix style * add suggestions from review * add ref to the original loss computation code * rename method + fix labels in tests * make style --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
Tom Aarsen authored
* Use EAFP principle to prevent crash with third parties * Remove leftover debugging code * Add info-level logger message
-
amyeroberts authored
-
amyeroberts authored
* Fix SigLip classification doctest * Remove extra line * Update src/transformers/models/siglip/modeling_siglip.py
-
amyeroberts authored
* Add utility for finding candidate models for deprecation * Better model filtering * Update * Add warning tip * Fix up * Review comments * Filter requests based on tags * Add copyright header
-
Arthur authored
* fix codellama conversion * nit
-
Younes Belkada authored
Update ssh-runner.yml
-
Younes Belkada authored
Update push-important-models.yml
-
Younes Belkada authored
* add SSH into our runners workflow * fix * fix * fix * use our previous approaches * forward contrib credits from discussions --------- Co-authored-by: Yih-Dar <ydshieh@users.noreply.github.com>
-
- 24 Apr, 2024 14 commits
-
-
Yih-Dar authored
* better names * run better names * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Non blocking support * Check for optimization * Doc
-
Zach Mueller authored
* Check removing flag for torch * LLM oops * Getting there... * More discoveries * Change * Clean up and prettify * Logic check * Not
-
jeffhataws authored
save_safetensors=True has been the default since release 4.35.0, which required a TPU hotfix https://github.com/huggingface/transformers/pull/27799 (issue https://github.com/huggingface/transformers/issues/27578). However, when save_safetensors is set to False (compatibility mode), moving the model to CPU causes too many graphs to be generated during checkpointing (https://github.com/huggingface/transformers/issues/28438). This PR disables moving the model to CPU when save_safetensors=False.
-
Arthur authored
update most of decision transformers research project
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Gustavo de Rosa authored
* chore(root): Initial commit of Phi-3 files. * fix(root): Fixes Phi-3 missing on readme. * fix(root): Ensures files are consistent. * fix(phi3): Fixes unit tests. * fix(tests): Fixes style of phi-3 test file. * chore(tests): Adds integration tests for Phi-3. * fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm. * fix(phi3): Fixes incorrect docstrings. * fix(phi3): Fixes docstring typos. * fix(phi3): Adds support for Su and Yarn embeddings. * fix(phi3): Improves according to first batch of reviews. * fix(phi3): Uses up_states instead of y in Phi3MLP. * fix(phi3): Uses gemma rotary embedding to support torch.compile. * fix(phi3): Improves how rotary embedding classes are defined. * fix(phi3): Fixes inv_freq not being re-computed for extended RoPE. * fix(phi3): Adds last suggestions to modeling file. * fix(phi3): Splits inv_freq calculation in two lines.
-
Yih-Dar authored
* trigger * remove the last job --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Eduardo Pacheco authored
* Fixed main train issues * Added loss test * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added missing labels arg in SegGptModel forward * Fixed typo * Added slow test to test loss calculation --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Marc Sun authored
* fix jamba slow forward for multi-gpu * remove comm * oups * style
-
Anton Vlasjuk authored
* fix clip's/siglip's _init_weights to reflect linear layers in "for image classification" * trigger slow tests
-
Fanli Lin authored
* make device-agnostic * clean code
-
Arthur authored
* nit * nit and fmt skip * fixup * Update src/transformers/convert_slow_tokenizer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * set to true --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Pavel Iakubovskii authored
* Add test for square image that fails * Fix for square images * Extend test cases * Fix resizing in tests * Style fixup
-