- 03 Aug, 2024 1 commit
-
Shaopeng Fu authored
fix: (issue #32124) Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py` (#32157)
-
- 02 Aug, 2024 3 commits
-
Sanchit Gandhi authored
* up
* style
* stopping
-
Joao Gante authored
tests! :D
-
Raushan Turganbay authored
nits
-
- 01 Aug, 2024 13 commits
-
Zach Mueller authored
* Test this zach
* Test for improper init w/o zero3
* Move back
* Apply suggestions from code review
* Get rid of stars in warning
* Make private
* Make clear

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
OsamaS99 authored
* fixed hybrid cache init, added test
* Fix test typo

Co-authored-by: Aaron Haag <aaron.haag@siemens.com>
-
Joao Gante authored
-
Nikos Karampatziakis authored
* Initial implementation of OffloadedCache
* enable usage via cache_implementation
* Address feedback, add tests, remove legacy methods
* Remove flash-attn, discover synchronization bugs, fix bugs
* Prevent usage in CPU-only mode
* Add a section about the offloaded KV cache to the docs
* Fix typos in docs
* Clarifications and better explanation of streams
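The offloading idea behind this commit — keep only the KV tensors of the layer currently executing (plus a prefetched neighbour) on the accelerator and park the rest in CPU memory — can be sketched with a toy class. `ToyOffloadedCache` and its method names are hypothetical illustrations, not the transformers API:

```python
class ToyOffloadedCache:
    """Toy sketch of an offloaded KV cache: each layer's tensors live on the
    CPU except for the layer currently running and the prefetched next one."""

    def __init__(self, num_layers):
        self.device = ["cpu"] * num_layers  # where each layer's KV pair lives

    def before_layer(self, i):
        # Fetch the current layer's cache and prefetch the next layer's,
        # so the accelerator never waits on a transfer mid-forward.
        self.device[i] = "gpu"
        self.device[(i + 1) % len(self.device)] = "gpu"

    def after_layer(self, i):
        # Evict the finished layer back to CPU memory.
        self.device[i] = "cpu"


cache = ToyOffloadedCache(num_layers=4)
peak_on_gpu = 0
for layer in range(4):
    cache.before_layer(layer)
    peak_on_gpu = max(peak_on_gpu, cache.device.count("gpu"))
    cache.after_layer(layer)
```

With this scheme GPU residency stays constant (here, two layers) regardless of model depth, which is the memory/latency trade the commit's docs section describes.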
-
Omar Salman authored
* Fix conflicting key in init kwargs in PreTrainedTokenizerBase
* Update code to check for callable key in save_pretrained
* Apply PR suggestions
* Invoke CI
* Updates based on PR suggestion
-
Viktor Scherbakov authored
empty list in defaults
-
Ita Zaporozhets authored
-
Hanna Yukhymenko authored
* Remove TPU device map for saving tokenizer config
* Update tokenization_utils_base.py
* Fix error msg when passing non-string device into tokenizer
* Fix error message for non-string tokenizer device
* Print out tokenizer device type in error msg
* Update tokenization_utils_base.py
-
nv-guomingz authored
Co-authored-by: Guoming Zhang <37257613+nv-guomingz@users.noreply.github.com>
-
Lunwen He authored
* Remove size check between attn_weights and kv_seq_len
* add unit tests
-
Sanchit Gandhi authored
* [whisper] compile compatibility with long-form decoding
* clarify comment
* fix after rebase
* finalise
* fix bsz
* fix cache split
* remove contiguous
* style
* finish
* update doc
* prevent cuda graph trace
-
Sanchit Gandhi authored
-
Raushan Turganbay authored
cache class flag
-
- 31 Jul, 2024 9 commits
-
Ricardo authored
-
Sai-Suraj-27 authored
Fixed staticmethods with self as first argument.
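The bug pattern fixed here — a `@staticmethod` whose signature declares `self` — can be shown with a minimal, hypothetical example (the real fixes touched utility classes in the repo, not these names):

```python
class Broken:
    @staticmethod
    def double(self, x):  # bug: a staticmethod receives no instance, so
        return x * 2      # `self` silently swallows the first argument


class Fixed:
    @staticmethod
    def double(x):        # correct: no `self` in a staticmethod signature
        return x * 2
```

Calling `Broken.double(3)` raises a `TypeError` because `3` binds to `self` and `x` is left missing, while `Fixed.double(3)` returns `6` — which is why the stray `self` goes unnoticed until the method is actually called.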
-
fxmarty authored
* draft
* apply changes to all relevant archs
* rerun ci - check_docstrings.py failing?
* fix docstring
* move 2D->4D mask creation to modeling file
* repo consistency
* fix the batch size = 1 case - calling contiguous is not enough
* nit
* style
* propagate to gemma/gemma-2
* prepare inputs for gemma generation
* implement test and tiny fix in gemma2
* Update src/transformers/models/bloom/modeling_bloom.py
* fix copies
* ci pass
* fix gemma's test_compile_static_cache tests
* flaky
* retrigger ci

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
-
Aymeric Roucher authored
Fix error when streaming agent run to gradio with non-string tool arguments
-
Joao Gante authored
-
amyeroberts authored
* Fix FA2 call for Perceiver layer
* Fix up
* [run_slow] idefics2
-
Joao Gante authored
fix 💩
-
Raushan Turganbay authored
* enable flash-attn & static cache
* this works, not the prev
* fix for sliding window layers
* not needed anymore
-
Raushan Turganbay authored
fix
-
- 30 Jul, 2024 12 commits
-
Joshua Lochner authored
* Remove user-defined tokens which can be obtained through merges
* Remove debug line
* formatting
* Refactor spm slow -> fast converter
* revert unnecessary refactor
* set comprehension
* remove test files
* Use `vocab_scores`
* Always replace spiece underline with space in decode
* we no longer need token filtering
* Add save fast load slow unit test
* Remove tokenizers version check
* Remove duplicate code
* Make `<start_of_turn>` and `<end_of_turn>` special tokens
* Bias merge priority with length if score is the same
* Add unit test for merge priority
* CI
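The "bias merge priority with length if score is the same" step can be sketched as a compound sort key. The tie-break direction shown (longer pieces first) and the `merges` data are assumptions for illustration, not taken from the converter:

```python
# Hypothetical (piece, score) pairs as a SentencePiece vocab would provide.
merges = [("ab", -1.0), ("abc", -1.0), ("x", -0.5)]

# Primary key: higher score first.  Tie-break: longer piece first, so
# equal-score merges get a deterministic order instead of an arbitrary one.
ordered = sorted(merges, key=lambda m: (-m[1], -len(m[0])))
```

Without the length term, `"ab"` and `"abc"` (identical scores) could land in either order depending on the input ordering, which is exactly the kind of instability a converter between slow and fast tokenizers needs to avoid.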
-
Joao Gante authored
* tmp
* skip files not in the diff
* use git.Repo instead of an external subprocess
* add tiny change to confirm that the diff is working on pushed changes
* add make quality task
* more professional main commit reference
-
fkrasnov2 authored
Fixes #32329: the Torch code is correct. To get an average of 10% of the total, we take 50% of the remainder after 80% has already been masked with [MASK] in the previous step.
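The arithmetic behind that fix — 50% of the post-[MASK] remainder equals 10% of the total — in a few lines of Python (variable names are illustrative, not the collator's):

```python
mask_fraction = 0.8              # tokens already replaced with [MASK]
random_share_of_rest = 0.5       # share of the remainder to randomize

remainder = 1.0 - mask_fraction                       # 0.2 of selected tokens
random_fraction = remainder * random_share_of_rest    # 0.1 of the total
```

This is why masking code samples the random-replacement step with probability 0.5 *conditioned on not already being [MASK]*, rather than with probability 0.1 outright.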
-
Wing Lian authored
Fix FSDP sharding across cpu and meta devices in cpu_efficient_loading for prequantized 4-bit models (#32276)
-
Sai-Suraj-27 authored
Fixed the raising of a few exceptions.
-
plaggy authored
* new agent plan
* plan type assertion
* style corrections
* better prompt naming
* make fixup
-
Joao Gante authored
* doc formatting nits
* ignore non-autodocs
* Apply suggestions from code review
* Update src/transformers/models/esm/modeling_esm.py
* make fixup

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yoach Lacombe authored
* tentative fix
* do the same for M4T
-
Luc Georges authored
-
Teddy Ferdinan authored
* fix epochs_trained as int when resuming training
* refactor

Co-authored-by: teddyferdinan <teddy.ferdinan@pwr.edu.pl>
-
Isotr0py authored
* fix gguf dequantize for gguf==0.9.1
* fix old version
* make style
-
Gilad Turok authored
Docs: fix GaLore optimizer example. Fixes incorrect usage of the GaLore optimizer in the Transformers trainer code example. The GaLore optimizer uses low-rank gradient updates to reduce memory usage. GaLore is quite popular and is implemented by the authors at [https://github.com/jiaweizzhao/GaLore](https://github.com/jiaweizzhao/GaLore). A few months ago GaLore was added to the Hugging Face Transformers library in https://github.com/huggingface/transformers/pull/29588. The documentation of the Trainer module includes a few code examples of how to use GaLore; however, the `optim_target_modules` argument to `TrainingArguments` was incorrect, as discussed in https://github.com/huggingface/transformers/pull/29588#issuecomment-2006289512. This pull request fixes that issue.
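For orientation, a hedged sketch of what a GaLore configuration looks like in the Trainer: GaLore is selected via an `optim` value and `optim_target_modules` picks which modules receive low-rank updates. The `output_dir` and the regex patterns below are illustrative assumptions, not values from the fixed example:

```python
from transformers import TrainingArguments

# Sketch only: a GaLore optimizer name plus optim_target_modules
# selecting the attention and MLP modules for low-rank updates.
args = TrainingArguments(
    output_dir="./out",                               # illustrative path
    optim="galore_adamw",
    optim_target_modules=[r".*attn.*", r".*mlp.*"],   # illustrative patterns
)
```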
-
- 29 Jul, 2024 2 commits
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aymeric Roucher authored
Add stream_to_gradio method for running agent in gradio demo
-