- 01 Aug, 2024 2 commits
-
Sanchit Gandhi authored
-
Raushan Turganbay authored
cache class flag
-
- 31 Jul, 2024 9 commits
-
Ricardo authored
-
Sai-Suraj-27 authored
Fixed staticmethods with self as first argument.
-
fxmarty authored
* draft
* apply changes to all relevant archs
* rerun ci - check_docstrings.py failing?
* fix docstring
* move 2D->4D mask creation to modeling file
* repo consistency
* fix the batch size = 1 case - calling contiguous is not enough
* nit
* style
* propagate to gemma/gemma-2
* prepare inputs for gemma generation
* implement test and tiny fix in gemma2
* Update src/transformers/models/bloom/modeling_bloom.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* fix copies
* ci pass
* fix gemma's test_compile_static_cache tests
* flaky
* retrigger ci

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Aymeric Roucher authored
Fix error when streaming agent run to gradio with non-string tool arguments
-
Joao Gante authored
-
amyeroberts authored
* Fix FA2 call for Perceiver layer
* [run_slow] idefics2
* Fix up
* [run_slow] idefics2
-
Joao Gante authored
fix 💩
-
Raushan Turganbay authored
* enable flash-attn & static cache
* this works, not the previous approach
* fix for sliding window layers
* not needed anymore
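A minimal sketch of what this enables, assuming a CUDA machine with flash-attn installed (the checkpoint is a placeholder, not taken from the commit):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "google/gemma-2-2b"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(ckpt)

inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
# Static cache + flash-attn in one generate call:
out = model.generate(**inputs, cache_implementation="static", max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```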
-
Raushan Turganbay authored
fix
-
- 30 Jul, 2024 12 commits
-
Joshua Lochner authored
* Remove user-defined tokens which can be obtained through merges
* Remove debug line
* formatting
* Refactor spm slow -> fast converter
* revert unnecessary refactor
* set comprehension
* remove test files
* Use `vocab_scores`
* Always replace spiece underline with space in decode
* we no longer need token filtering
* Add save fast load slow unit test
* Remove tokenizers version check
* Remove duplicate code
* Make `<start_of_turn>` and `<end_of_turn>` special tokens
* Bias merge priority with length if score is the same
* Add unit test for merge priority
* CI
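A sketch of the save-fast/load-slow round trip the new unit test covers (the checkpoint and text are illustrative assumptions):

```python
from transformers import AutoTokenizer

# Save the fast (tokenizers-backed) tokenizer, then reload it through the
# slow (sentencepiece-backed) path and check the two agree.
fast = AutoTokenizer.from_pretrained("google/gemma-2b")  # placeholder checkpoint
fast.save_pretrained("./gemma-tok")

slow = AutoTokenizer.from_pretrained("./gemma-tok", use_fast=False)

text = "<start_of_turn>Hello  world<end_of_turn>"
assert fast(text).input_ids == slow(text).input_ids
```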
-
Joao Gante authored
* tmp
* skip files not in the diff
* use git.Repo instead of an external subprocess
* add tiny change to confirm that the diff is working on pushed changes
* add make quality task
* more professional main commit reference
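A small sketch of the `git.Repo` approach (the repository path and ref are assumptions):

```python
from git import Repo  # GitPython

repo = Repo(".")  # assumes we run from the repository root
# Names of files that differ from the main branch, without hand-rolling
# a subprocess call to `git diff --name-only main`:
changed_files = repo.git.diff("main", name_only=True).splitlines()
print(changed_files)
```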
-
fkrasnov2 authored
Fixes #32329: the Torch code is correct - to get an average of 10% of the total, we take 50% of the remainder after we've already masked 80% with [MASK] in the previous step (0.5 × 0.2 = 0.1).
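For context, a minimal sketch of the standard BERT-style masking split this arithmetic describes (token ids and shapes are illustrative):

```python
import torch

labels = torch.randint(0, 30000, (8, 128))  # illustrative token ids
mask_token_id, vocab_size = 103, 30000

# 15% of all tokens are selected for the MLM objective.
selected = torch.bernoulli(torch.full(labels.shape, 0.15)).bool()

# 80% of the selected tokens become [MASK].
replaced = torch.bernoulli(torch.full(labels.shape, 0.8)).bool() & selected

# 50% of the *remaining* selected tokens (i.e. 10% of all selected) become a
# random token; the final 10% are left unchanged.
random_tok = torch.bernoulli(torch.full(labels.shape, 0.5)).bool() & selected & ~replaced

inputs = labels.clone()
inputs[replaced] = mask_token_id
inputs[random_tok] = torch.randint(0, vocab_size, labels.shape)[random_tok]
```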
-
Wing Lian authored
Fixes to properly shard FSDP across cpu and meta devices with cpu_efficient_loading for prequantized 4-bit models (#32276)
-
Sai-Suraj-27 authored
Fixed the raising of a few exceptions.
-
plaggy authored
* new agent plan
* plan type assertion
* style corrections
* better prompt naming
* make fixup
-
Joao Gante authored
* doc formatting nits
* ignore non-autodocs
* Apply suggestions from code review
* Update src/transformers/models/esm/modeling_esm.py
* make fixup

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yoach Lacombe authored
* tentative fix
* do the same for M4T
-
Luc Georges authored
-
Teddy Ferdinan authored
* fix epochs_trained as int when resuming training
* refactor

Co-authored-by: teddyferdinan <teddy.ferdinan@pwr.edu.pl>
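A tiny hypothetical sketch (names are illustrative, not the actual Trainer code) of why the cast matters: `range()` rejects floats, so the resumed epoch count must be an int.

```python
num_train_epochs = 3.0            # stored as a float
global_step, max_steps = 1000, 3000

# Without int(), this is a float and range() below raises TypeError.
epochs_trained = int(num_train_epochs * global_step / max_steps)

for epoch in range(epochs_trained, int(num_train_epochs)):
    print(f"resuming at epoch {epoch}")
```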
-
Isotr0py authored
* fix gguf dequantize for gguf==0.9.1
* fix old version
* make style
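A hedged sketch of the compatibility pattern such a fix typically uses; the exact module paths are assumptions, not the verbatim patch:

```python
try:
    # gguf >= 0.9.1 exposes dequantization helpers under gguf.quants
    from gguf.quants import dequantize
except ImportError:
    # hypothetical fallback for older gguf releases
    from gguf import dequantize
```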
-
Gilad Turok authored
Docs: fix GaLore optimizer example. Fixes incorrect usage of the GaLore optimizer in a Transformers Trainer code example. The GaLore optimizer uses low-rank gradient updates to reduce memory usage; it is quite popular and is implemented by its authors at https://github.com/jiaweizzhao/GaLore. GaLore was added to the Hugging Face Transformers library in https://github.com/huggingface/transformers/pull/29588, and the Trainer documentation includes a few code examples of how to use it. However, the `optim_target_modules` argument to `TrainingArguments` was incorrect, as discussed in https://github.com/huggingface/transformers/pull/29588#issuecomment-2006289512. This pull request fixes that.
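A minimal sketch of the corrected usage (the output directory and target-module patterns are illustrative):

```python
from transformers import TrainingArguments

# optim_target_modules takes patterns matching the modules whose gradients
# get the low-rank GaLore projection.
args = TrainingArguments(
    output_dir="./out",                              # placeholder
    optim="galore_adamw",                            # requires the galore-torch package
    optim_target_modules=[r".*attn.*", r".*mlp.*"],  # illustrative patterns
)
```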
-
- 29 Jul, 2024 13 commits
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aymeric Roucher authored
Add stream_to_gradio method for running an agent in a Gradio demo
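A hedged sketch of the Gradio wiring, loosely following the agents docs of this release; the import paths, the default `HfEngine` model, and the chat plumbing are assumptions:

```python
import gradio as gr
from transformers import HfEngine, ReactCodeAgent, stream_to_gradio

agent = ReactCodeAgent(tools=[], llm_engine=HfEngine())  # placeholder engine/tools

def interact(task, history):
    history = history or []
    # stream_to_gradio yields chat messages as the agent thinks and acts
    for message in stream_to_gradio(agent, task):
        history = history + [message]
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(label="Agent")
    textbox = gr.Textbox(placeholder="Ask the agent...")
    textbox.submit(interact, [textbox, chatbot], chatbot)

demo.launch()
```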
-
Guang Yang authored
-
Sanchit Gandhi authored
* [pipeline] fix padding for 1-d tensors
* add test
* make style
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py

Co-authored-by: Kamil Akesbi <45195979+kamilakesbi@users.noreply.github.com>
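For reference, a minimal sketch of right-padding a 1-d tensor (shapes are illustrative, not taken from the pipeline code):

```python
import torch
import torch.nn.functional as F

x = torch.randn(37)          # a 1-d feature tensor
target_len = 48
x = F.pad(x, (0, target_len - x.shape[-1]))  # pad only the last dimension
print(x.shape)               # torch.Size([48])
```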
-
Kamil Akesbi authored
* fix _fix_key in PreTrainedModel
* fix _find_longest_common_sequence
* add test
* remove result.json
* nit
* update test
-
Joao Gante authored
* mvp
* added test (a few models need fixes)
* fix a few test cases
* test nits
* harder test 😈
* revert changes in stablelm
* test with improved condition
* add todo
* tmp commit
* merged with main
* nits
* add todo
* final corrections
* add docs for generation compilation
* docs nits
* add tip
* PR suggestions
* add more details to the compilation docs
* fix cache positions
* cache is now init in generate; update docs
* tag test as flaky
* docs
* post rebase make fixup and other nits
* remove unintended changes
* whisper (encoder-decoder) not supported
* move token default updates to ; add tests for token defaults
* push changes
* manual rebase
* chameleon doesn't support this
* fix test_static_cache_mha_mqa_gqa (broken in another PR)
* docs: dynamic is better with end-to-end compilation
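A minimal end-to-end sketch of the compiled generation the new docs describe; the checkpoint is a placeholder and a CUDA device is assumed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(ckpt)

# Compile the forward pass; a static cache keeps tensor shapes fixed, so the
# compiled graph is reused across decoding steps.
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to("cuda")
out = model.generate(**inputs, cache_implementation="static", max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
-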
Sai-Suraj-27 authored
Fixed a link in docs.
-
Fanli Lin authored
* fix
* bug fix
* refine
* fix
-
Joao Gante authored
remove exceptions
-
Sai-Suraj-27 authored
Removed a wrong argument passed to the convert_blip_checkpoint function call.
-
leejet authored
* Optimize t5 tokenize logic to avoid redundant calls
* fix and overwrite copies
-
Yih-Dar authored
upload
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* bloom dynamic cache
* bloom follows standard cache format
* no skips for bloom anymore
* use cache position when possible
* clean up
* codestyle
* Update src/transformers/models/bloom/modeling_bloom.py
* pr comments
* isinstance fix
* address comments
* make musicgen test happy
* [run-slow] bloom

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
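A minimal sketch of the standard cache format bloom now follows (the checkpoint is a real bloom model, but the snippet is illustrative, not the patch itself):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

ckpt = "bigscience/bloom-560m"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

inputs = tokenizer("Hello", return_tensors="pt")
past = DynamicCache()  # the standard cache object bloom now accepts
with torch.no_grad():
    out = model(**inputs, past_key_values=past, use_cache=True)
print(out.past_key_values.get_seq_length())
```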
-
- 27 Jul, 2024 1 commit
-
Joao Gante authored
* replace for loop with tensor ops
* rm assert; readability
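An illustrative (not the actual) before/after of the pattern:

```python
import torch

scores = torch.randn(8, 100)
idx = torch.randint(0, 100, (8,))

# loop version: one Python-level iteration per row
picked_loop = torch.stack([scores[i, idx[i]] for i in range(scores.shape[0])])

# vectorized version: a single advanced-indexing op
picked = scores[torch.arange(scores.shape[0]), idx]

assert torch.equal(picked, picked_loop)
```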
-
- 26 Jul, 2024 3 commits
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* fix
* fix prev test (half of failures)
* [run-slow] llama, gemma2
-
Fanli Lin authored
[tests] fix `static` cache implementation is not compatible with `attn_implementation==flash_attention_2` (#32039)
* add flash attention check
* fix
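A hedged sketch of the kind of guard this adds (the variable names are illustrative; the real check lives in the test/modeling utilities):

```python
attn_implementation = "flash_attention_2"
cache_implementation = "static"

# At this point in the release, the static cache and flash-attention-2
# could not be combined, so the configuration is rejected up front.
if attn_implementation == "flash_attention_2" and cache_implementation == "static":
    raise ValueError(
        "The `static` cache is not compatible with "
        "`attn_implementation==flash_attention_2`; use the default dynamic "
        "cache or another attention backend."
    )
```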
-