- 05 Jul, 2024 3 commits
-
-
Kazuaki Ishizaki authored
return correct device when ACCELERATE_TORCH_DEVICE is defined
-
Marc Sun authored
* Fix serialization * style * add test
-
mxkopy authored
* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding. * fixed trailing whitespace
-
- 04 Jul, 2024 3 commits
-
-
Billy Cao authored
* Add torch_empty_cache_steps to TrainingArguments * Fix formatting * Add torch_empty_cache_steps to docs on single gpu training * Remove check for torch_empty_cache_steps <= max_steps * Captalize Tip * Be device agnostic * Fix linting
-
hoshi-hiyouga authored
Update __init__.py
-
Yih-Dar authored
pytest_num_workers=4 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Jul, 2024 9 commits
-
-
Pavel Iakubovskii authored
* Fix init for rt-detr heads * Fixup * Add separate prior_prob value to config for initialization * Add bbox init * Change to 1 / num_labels init * Adjust weights init test * Fix style for test
-
Pavel Iakubovskii authored
* Fix cache and type conversion * Add test * Fixup * nit * [run slow] rt_detr * Fix test * Fixup * [run slow] rt_detr * Update src/transformers/models/rt_detr/modeling_rt_detr.py
-
Willard Sheen authored
* [fix BUG] pad labels before use it in preprocess_logits_for_metrics * a more readable fix labels can't use `gather` before pass to `preprocess_logits_for_metrics`, so must split into 2 if-block * add a comment * oh code quality check
-
Nate Brake authored
Update trainer.py
-
Joao Gante authored
gemma 2 slow tests
-
Pablo Montalvo authored
-
Aymeric Roucher authored
* Adds final answer tool for all agents * Typo * Add clarification in doc * Put final_answer tool adition in agent for clarity
-
Ella Charlaix authored
-
jiqing-feng authored
* fix assisted decoding * check None * fix typo * fix _prepare_special_tokens * fix style * fix lint * add tests for assisted decoding * fix style * fix tests check
-
- 02 Jul, 2024 7 commits
-
-
Jörg Bornschein authored
* Fix documentation for Gemma2. Model sizes and Blog post URL are wrong in the documentation. * Update docs/source/en/model_doc/gemma2.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
Make the order of array items consistent using sorted()
-
Joao Gante authored
* rely on the tokenizer default kwargs * fix a few tests
-
Sanchit Gandhi authored
* make work with cache abstraction * correct for static cache * hacks for compile * make fast * fix * fix pos ids * generate * fix sdpa * fix sdpa cache pos * fix fa2 * clean fa2 * integrate cache into generate * make style * copies * more copies * update eager * update sdpa * update fa2 * simplify * use cache pos * always compute cross-cache for debug * avoid recompiles Co-authored-by: Arthur Zucker <arthur@huggingface.co> * fix fix * fix fix fix * more fix * try encoder-decoder cache (too messy) * revert encoder-decoder cache * check cross-attn cache * use enc-dec dataclass * use richer enc-dec dataclass * clean-up * revert static cache changes * small fixes * revert to cpu flag * fix copies * add static slow test * past k/v docstring * more docstrings * cache_position docstrings * add to docs * add enc-dec cache to docs * make style * fix after rebase * fix beam * style * fix generation strategies * fix most decoder-only tests * style * skip test * more clean up * small docstrings * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add todo * only crop self-attn * check cache in mixin * style * fix re-compile after rebase * move `is_updated` logic to enc-dec wrapper * revert back * revert cache back * finalise design * fix * fix fix * style * Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * deprecate * updates * final updates * style * style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
fxmarty authored
* use bitwise or * why is the CI not triggered?
-
Yih-Dar authored
* move * move * move * move * Update tests/utils/test_image_processing_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Krisztián Boros authored
* remove incorrect urls pointing to the llava repository * remove incorrect urls pointing to the llava repository; removing entire comments * remove incorrect urls pointing to the llava repository; removing entire comments; ran fix-copies * ran fixup
-
- 01 Jul, 2024 1 commit
-
-
Joao Gante authored
* keras nlp pin * this should use the new docker images:dev * dev-ci
-
- 28 Jun, 2024 6 commits
-
-
Jade Choghari authored
* Add French translation of run scripts tutorial * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/run_scripts_fr.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Jade Choghari <chogharijade@icloud.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Arthur authored
* softcapping * soft cap before the mask * style * ... * super nit
-
Sangbum Daniel Choi authored
* add gather_use_object arguments * fix name and pass the CI test for Seq2SeqTrainer * make style * make it to functools * fix typo * add accelerate version: * adding warning * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * make style * Update src/transformers/training_args.py * check function move to initial part * add test for eval_use_gather_object --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
Jacky Lee authored
* fix: use return_dict parameter * fix: type checks * fix: unused imports * update: one-line if else * remove: recursive check
-
hoshi-hiyouga authored
Update modeling_gemma2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Wing Lian authored
* don't zero out the attention_mask when using sliding window with flash attention * chore: lint
-
- 27 Jun, 2024 11 commits
-
-
Sanchit Gandhi authored
* fix gemma2 * handle in generate
-
Steven Liu authored
quick usage to top
-
Billy Cao authored
-
Arthur authored
* nit * toctree issue * protect gemma2 tests as well * sdpa supported
-
Lysandre authored
-
Arthur authored
* inital commit * Add doc * protect? * fixup stuffs * update tests * fix build documentation * mmmmmmm config attributes * style * nit * uodate * nit * Fix docs * protect some stuff --------- Co-authored-by: Lysandre <lysandre@huggingface.co>
-
Raushan Turganbay authored
remove
-
Sangbum Daniel Choi authored
* change anchor_image_size None for compatibility * make fix-copies
-
Billy Cao authored
* Allow dtype str for torch_dtype in from_pretrained * Update docstring * Add tests for str torch_dtype
-
Arthur authored
* fix and simplify the script! * add co-author --------- Co-authored-by: crackalamoo <crackalamoo@users.noreply.github.com>
-
Merve Noyan authored
* fixed models * format with bumped ruff version on my local * fix copies * add tracing checks * format * Update src/transformers/utils/generic.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * format * style fix * Update modeling_mobilevit.py * add docstring and change name * Update __init__.py * Update __init__.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-