- 13 Mar, 2024 17 commits
-
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* fix batching tests for new models
* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Refactor TFP call to just sigmoid()
* Make sure we cast to the right dtype
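A minimal sketch of the refactor's shape, assuming the old code went through a tensorflow_probability distribution to recover Bernoulli probabilities (the function name and target dtype here are illustrative):

```python
import tensorflow as tf

def logits_to_probs(logits: tf.Tensor, target_dtype=tf.float32) -> tf.Tensor:
    # The Bernoulli probability of a logit is just its sigmoid; no need to
    # build a tfp distribution. Cast explicitly so the output dtype matches
    # what downstream code expects.
    return tf.cast(tf.sigmoid(logits), target_dtype)
```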
-
Fanli Lin authored
[tests] make `test_trainer_log_level_replica` run on accelerators with more than 2 devices (#29609)
add new arg
-
amyeroberts authored
* Move normalization for numerical stability
* Apply suggestions from code review: remove useless x=x line
* PR comment - normalize later to preserve var name meaning
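The numerical-stability move is the standard max-subtraction trick; a minimal sketch of the idea, not the literal diff:

```python
import torch

def stable_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Subtracting the row-wise max leaves softmax unchanged mathematically
    # but prevents exp() from overflowing on large logits.
    x = x - x.max(dim=dim, keepdim=True).values
    exp = x.exp()
    return exp / exp.sum(dim=dim, keepdim=True)
```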
-
Sourab Mangrulkar authored
* fsdp+qlora related changes
* fixes
* Update quantization_config.py
* support fsdp+qlora and dsz3+qlora
* Update quantization_config.py
* Update modeling_utils.py (several iterations)
* handle fsdp+qlora and dsz3+qlora correctly while model loading
* fix param count
* quality
* fsdp related changes
* fsdp changes only when using LoRA/QLoRA
* add accelerate version check
* refactor, update min accelerate version and add tests:
  1. Update minimum accelerate version to 0.26.0
  2. Clean the trainer wrt accelerate version checks
  3. FSDP refactor and test for fsdp config
  4. Use `itemsize` instead of a `dtype2bytes` dict (see the sketch below)
* fix test
* Address comments
* fix the conditional flag
* fix conditional flag
* address comments

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>
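On point 4, a sketch of the idea: per-parameter byte counts can be read from the dtype or tensor itself rather than kept in a hand-maintained `dtype2bytes` mapping (function names here are illustrative):

```python
import torch

def bytes_per_param(dtype: torch.dtype) -> int:
    # torch.dtype.itemsize (available in recent PyTorch releases) reports the
    # element size directly, replacing a lookup table like
    # {torch.float32: 4, torch.float16: 2, torch.int8: 1, ...}.
    return dtype.itemsize

def model_size_in_bytes(model: torch.nn.Module) -> int:
    # Tensor.element_size() is the per-tensor equivalent.
    return sum(p.numel() * p.element_size() for p in model.parameters())
```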
-
njackman-2344 authored
* torchscript and trainer md es translation
* corrected md es files and even corrected spelling in en md
* made es corrections to trainer.md
* deleted entrenamiento... title on yml
* placed entrenamiento in right place
* translated es chat_templating.md w/ yml addition
* requested es changes to md and yml
* last es changes to md
-
Jiewen Tan authored
* tmp
* Remove debug step
* Fix a typo
* Move to is_torch_xla_available
-
Joao Gante authored
-
amyeroberts authored
* Use einsum where possible
* Fix
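An illustration of the kind of rewrite "use einsum where possible" refers to (shapes are arbitrary):

```python
import torch

q = torch.randn(2, 8, 16, 64)  # (batch, heads, seq, head_dim)
k = torch.randn(2, 8, 16, 64)

# Chained ops: transpose then batched matmul.
scores_matmul = q @ k.transpose(-1, -2)

# Equivalent einsum: the subscripts document the contraction explicitly.
scores_einsum = torch.einsum("bhqd,bhkd->bhqk", q, k)

assert torch.allclose(scores_matmul, scores_einsum, atol=1e-6)
```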
-
Dries Verachtert authored
-
Sanchit Gandhi authored
* [generate] deprecate forced ids processor
* add todo
* make message clearer
-
Lysandre Debut authored
* Adds pretrained IDs directly in the tests
* Fix tests
* Fix tests
* Review!
-
Lysandre Debut authored
* Warn against remote tool use
* Additional disclaimer
* Update docs/source/en/custom_tools.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Sanchit Gandhi authored
deprecate old funcs
-
Younes Belkada authored
fix
fix copies
-
bytebarde authored
* initial implementation of flash attention for gptj (usage sketch below)
* modify flash attention and overwrite test_flash_attn_2_generate_padding_right
* update flash attention support list
* remove the copy line in the `CodeGenBlock`
* address copy mechanism
* Update src/transformers/models/gptj/modeling_gptj.py
* Add GPTJ attention classes
* add expected outputs in the gptj test
* Ensure repo consistency with 'make fix-copies'

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
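With this commit, GPT-J joins the models that accept `attn_implementation="flash_attention_2"`. A usage sketch (checkpoint name illustrative; requires the `flash-attn` package and a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # flash attention requires fp16/bf16
    attn_implementation="flash_attention_2",
    device_map="auto",
)
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```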
-
- 12 Mar, 2024 13 commits
-
-
Younes Belkada authored
* Update convert_gemma_weights_to_hf.py
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
* fixup
-
Joao Gante authored
check max_position_embeddings
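A sketch of what such a guard typically looks like (names illustrative, not the literal patch):

```python
def check_input_length(input_ids, config):
    # Warn or raise when the prompt already exceeds the model's maximum
    # supported position; generating beyond it would silently degrade.
    max_pos = getattr(config, "max_position_embeddings", None)
    if max_pos is not None and input_ids.shape[-1] > max_pos:
        raise ValueError(
            f"Input length {input_ids.shape[-1]} exceeds "
            f"max_position_embeddings ({max_pos})."
        )
```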
-
Bharat Ramanathan authored
fix: handle logging of scalars in wandb summary
Fixes #29430
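The shape of the fix, hedged: tensor-valued metrics need converting to plain Python scalars before they land in the W&B run summary (helper name illustrative):

```python
import torch

def to_summary_value(value):
    # One-element tensors become Python scalars; everything else passes through.
    if isinstance(value, torch.Tensor) and value.numel() == 1:
        return value.item()
    return value

# Usage, assuming an active run:
# wandb.run.summary["eval/loss"] = to_summary_value(metrics["eval_loss"])
```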
-
Raushan Turganbay authored
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
* Update tests/test_modeling_common.py (several iterations)
* fixes and comments
* use cosine distance for conv models (see the sketch below)
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
* finalize and make style
* check model type by input names
* fixed batch size for all testers
* Revert "fixed batch size for all testers" (reverts commit 525f3a0a058f069fbda00352cf202b728d40df99)
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
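On "use cosine distance for conv models": convolution outputs can drift elementwise between batched and single-sample runs, so the tests compare direction rather than raw values. A minimal sketch, not the literal test code:

```python
import torch
import torch.nn.functional as F

def outputs_match(batched_row: torch.Tensor, single: torch.Tensor,
                  threshold: float = 1e-4) -> bool:
    # Cosine distance = 1 - cosine similarity, computed on flattened outputs.
    sim = F.cosine_similarity(batched_row.flatten(), single.flatten(), dim=0)
    return (1.0 - sim).item() < threshold
```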
-
Furkan Akkurt authored
Update quantization.md
-
Yih-Dar authored
* update
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* Set env var to hold Keras at Keras 2
* Add Amy's update
* make fixup
* Use a warning instead
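Background, hedged: with Keras 3 installed, `tf.keras` changes semantics, and the standard pin is an environment variable set before TensorFlow is imported. This sketch assumes the `TF_USE_LEGACY_KERAS` variable and the `tf-keras` package:

```python
import os

# Must run before `import tensorflow`: routes tf.keras to the legacy
# Keras 2 implementation provided by the tf-keras package.
os.environ.setdefault("TF_USE_LEGACY_KERAS", "1")

import tensorflow as tf  # noqa: E402
```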
-
Hilco van der Wilk authored
* Update legacy Repository usage in `examples/pytorch/text-classification/run_glue_no_trainer.py` (marked for deprecation: https://huggingface.co/docs/huggingface_hub/guides/upload#legacy-upload-files-with-git-lfs)
* Fix import order
* Replace all example usage of deprecated Repository
* Fix remaining repo call and rename args variable
* Revert removing creation of gitignore files and don't change research examples
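The replacement pattern, roughly: the git-based `Repository` class gives way to the HTTP upload helpers in `huggingface_hub` (repo id and paths below are illustrative):

```python
from huggingface_hub import HfApi, create_repo

repo_id = "username/my-glue-model"  # illustrative
create_repo(repo_id, exist_ok=True)

HfApi().upload_folder(
    folder_path="output_dir",  # local directory containing the saved model
    repo_id=repo_id,
    commit_message="End of training",
)
```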
-
tomigee authored
Implemented add_pooling_layer argument
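For BERT-style encoders this flag controls whether the pooler head is built at all; a usage sketch of the pattern (model class illustrative, not necessarily the one this commit touched):

```python
from transformers import BertModel

# Skip building pooler weights when only token-level outputs are needed;
# this also avoids "pooler weights newly initialized" warnings.
model = BertModel.from_pretrained("bert-base-uncased", add_pooling_layer=False)
assert model.pooler is None
```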
-
Kola authored
* Fix typo (determine)
* ruff
* Update src/transformers/models/mamba/configuration_mamba.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Fix examples to stop passing None to compile(), rework example invocation for run_text_classification.py
* Add Amy's fix
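Context, hedged: transformers TF models can compute their loss internally, so the examples should omit the `loss` argument rather than pass `None` explicitly:

```python
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Before (problematic): model.compile(optimizer=..., loss=None)
# After: omit `loss`; the model falls back to its internal loss computation
# when labels are included in the inputs.
model.compile(optimizer=tf.keras.optimizers.Adam(3e-5))
```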
-
Dries Verachtert authored
-
Raushan Turganbay authored
fix fuyu docs
-
- 11 Mar, 2024 10 commits
-
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
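Background, hedged: MLX saves weights as safetensors with a `format` marker in the header metadata, and the change is about accepting that marker at load time. A sketch of inspecting the metadata (file path illustrative):

```python
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    metadata = f.metadata() or {}
    print(metadata.get("format"))  # e.g. "pt", or "mlx" for MLX-saved weights
```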
-
fzyzcjy authored
* Update add_new_model.md
* Update docs/source/en/add_new_model.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Amrit Gupta authored
Fixed broken link for Resources -> Token Classification -> Finetuning BERT for named-entity recognition
-
Klaus Hipp authored
* Add missing localized READMEs to the copies check
* Run check to resolve all inconsistencies
-
yuanzhoulvpi authored
fix error in trainer: TypeError: Object of type Tensor is not JSON serializable
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
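The error's shape, hedged: `json.dumps` cannot serialize `torch.Tensor`, so values headed for a JSON log need converting first (helper name illustrative):

```python
import json
import torch

def json_safe(obj):
    # Recursively convert tensors to scalars/lists before json.dumps.
    if isinstance(obj, torch.Tensor):
        return obj.item() if obj.numel() == 1 else obj.tolist()
    if isinstance(obj, dict):
        return {k: json_safe(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [json_safe(v) for v in obj]
    return obj

logs = {"loss": torch.tensor(0.42), "step": 100}
print(json.dumps(json_safe(logs)))  # the raw dict would raise TypeError
```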
-
Yih-Dar authored
save CI life
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Klaus Hipp authored
[Docs] Fix FastSpeech2Conformer links
-
Yitong Huang authored
* add USE_TORCH_XLA env (see the sketch below)
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling

Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
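A sketch of the renamed check together with the opt-out variable from the first bullet (it must be set before anything imports torch_xla):

```python
import os

# USE_TORCH_XLA lets you force-disable XLA even when torch_xla is installed.
os.environ.setdefault("USE_TORCH_XLA", "1")  # "0" disables

from transformers.utils import is_torch_xla_available

if is_torch_xla_available():
    import torch_xla.core.xla_model as xm
    device = xm.xla_device()
else:
    device = "cpu"
```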
-
Damith Senanayake authored
* Fix error #29332: the _check_and_enable_flash_attn_2() method received a check_device_map parameter and failed
* style fixup
-
Tanay Mehta authored
* add: initial script to train clm fim (see the sketch of the FIM transform below)
* fix: if training model from scratch, new tokens will be added and embeddings resized
* fix: fixed attention_mask errors when generating FIM data
* fix: file formatted using black
* add: run_fim_no_trainer.py and fixed some comments in run_fim.py
* add: added fim examples to the README.md and ran code fixup
* fix: little bug in both fim training scripts
* fix: remove comment from notebook and added a note on fim related params
* fix: minor typo in README
* add: suggested minor changes to README and run_fim.py
* add: gradient_accumulation_steps and gradient_checkpointing args
* add: improved model embedding resizing
* add: pad_to_multiple_of and attn_implementation params
* add: requested minor changes
* add: deepspeed zero compatibility
* add: resize embeddings layer with zero3 support for fim model initialization
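Background on the FIM (fill-in-the-middle) transform these scripts train with, as a hedged sketch: a document is split into prefix/middle/suffix and re-serialized with sentinel tokens so the model learns to generate the middle from both sides. Sentinel names follow the common FIM convention, not necessarily these scripts':

```python
import random

FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"

def apply_fim(text: str, rng: random.Random) -> str:
    # Pick two cut points, then emit prefix|suffix|middle (PSM format) so the
    # middle is conditioned on both surrounding contexts. Adding the sentinel
    # tokens is why the scripts resize the model's embedding layer.
    lo, hi = sorted(rng.sample(range(len(text)), 2))
    prefix, middle, suffix = text[:lo], text[lo:hi], text[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

print(apply_fim("def add(a, b):\n    return a + b\n", random.Random(0)))
```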
-