- 13 Mar, 2024 3 commits
-
Sanchit Gandhi authored
deprecate old funcs
-
Younes Belkada authored
fix fix copies
-
bytebarde authored
* initial implementation of flash attention for gptj
* modify flash attention and overwrite test_flash_attn_2_generate_padding_right
* update flash attention support list
* remove the copy line in the `CodeGenBlock`
* address copy mechanism
* Update src/transformers/models/gptj/modeling_gptj.py
* Add GPTJ attention classes
* add expected outputs in the gptj test
* Ensure repo consistency with 'make fix-copies'
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 12 Mar, 2024 13 commits
-
Younes Belkada authored
* Update convert_gemma_weights_to_hf.py
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
* fixup
-
Joao Gante authored
check max_position_embeddings
-
Bharat Ramanathan authored
fix: handle logging of scalars in wandb summary fixes: #29430
-
Raushan Turganbay authored
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
* Update tests/test_modeling_common.py
* fixes and comments
* use cosine distance for conv models
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
* finalize and make style
* check model type by input names
* fixed batch size for all testers
* Revert "fixed batch size for all testers" (reverts commit 525f3a0a058f069fbda00352cf202b728d40df99)
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Furkan Akkurt authored
Update quantization.md
-
Yih-Dar authored
* update
* update
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* Set env var to hold Keras at Keras 2
* Add Amy's update
* make fixup
* Use a warning instead
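For context, `TF_USE_LEGACY_KERAS` is TensorFlow's documented switch for keeping `tf.keras` on the Keras 2 implementation when Keras 3 is also installed. A minimal sketch of that pin (how transformers applies it internally is the commit's detail, not reproduced here):

```python
import os

# Pin tf.keras to the legacy Keras 2 implementation. This must be set
# before `import tensorflow`, or the choice has already been made.
os.environ["TF_USE_LEGACY_KERAS"] = "1"
```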
-
Hilco van der Wilk authored
* Update legacy Repository usage in `examples/pytorch/text-classification/run_glue_no_trainer.py` (marked for deprecation here: https://huggingface.co/docs/huggingface_hub/guides/upload#legacy-upload-files-with-git-lfs)
* Fix import order
* Replace all example usage of deprecated Repository
* Fix remaining repo call and rename args variable
* Revert removing creation of gitignore files and don't change research examples
-
tomigee authored
Implemented add_pooling_layer argument
-
Kola authored
* Fix type (determine)
* ruff
* Update src/transformers/models/mamba/configuration_mamba.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Fix examples to stop passing None to compile(), rework example invocation for run_text_classification.py
* Add Amy's fix
-
Dries Verachtert authored
-
Raushan Turganbay authored
fix fuyu docs
-
- 11 Mar, 2024 12 commits
-
Pedro Cuenca authored
* Experimental loading of MLX files
* Update exception message
* Add test
* Style
* Use model from hf-internal-testing
-
fzyzcjy authored
* Update add_new_model.md
* Update docs/source/en/add_new_model.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Amrit Gupta authored
Fixed broken link for Resources -> Token Classification -> Finetuning BERT for named-entity
-
Klaus Hipp authored
* Add missing localized READMEs to the copies check
* Run check to resolve all inconsistencies
-
yuanzhoulvpi authored
fix error in trainer: TypeError: Object of type Tensor is not JSON serializable
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
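The usual cure for this class of error is converting 0-dim tensors to plain Python scalars before serialization. A generic sketch (my own illustration, not the trainer's actual fix; `FakeScalarTensor` is a hypothetical stand-in for a 0-dim `torch.Tensor`):

```python
import json

def sanitize_for_json(obj):
    """Recursively replace tensor-like scalar values (anything exposing a
    callable .item()) with plain Python numbers so json.dumps succeeds."""
    if hasattr(obj, "item") and callable(obj.item):
        return obj.item()
    if isinstance(obj, dict):
        return {k: sanitize_for_json(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_for_json(v) for v in obj]
    return obj

# Hypothetical stand-in for a 0-dim torch.Tensor, for illustration only.
class FakeScalarTensor:
    def __init__(self, value):
        self.value = value
    def item(self):
        return self.value

summary = {"train/loss": FakeScalarTensor(0.25), "epoch": 3}
print(json.dumps(sanitize_for_json(summary)))
```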
-
Yih-Dar authored
save ci life
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Klaus Hipp authored
[Docs] Fix FastSpeech2Conformer links
-
Yitong Huang authored
* add USE_TORCH_XLA env
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
-
Damith Senanayake authored
* Fixing error #29332: the _check_and_enable_flash_attn_2() method receives a check_device_map parameter and fails
* style fixup
-
Tanay Mehta authored
* add: initial script to train clm fim
* fix: if training model from scratch, new tokens will be added and embeddings resized
* fix: fixed attention_mask errors when generating FIM data
* fix: file formatted using black
* add: run_fim_no_trainer.py and fixed some comments in run_fim.py
* add: added fim examples to the README.md and ran code fixup
* fix: little bug in both fim training scripts
* fix: remove comment from notebook and added a note on fim related params
* fix: minor typo in README
* add: suggested minor changes to README and run_fim.py
* add: gradient_accumulation_steps and gradient_checkpointing args
* add: improved model embedding resizing
* add: pad_to_multiple_of and attn_implementation params
* add: requested minor changes
* add: deepspeed zero compatibility
* add: resize embeddings layer with zero3 support for fim model initialization
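For readers unfamiliar with fill-in-the-middle (FIM) training: the core data transform splits a document at two random points and reorders it so the model learns to generate the middle given both sides. A generic sketch under assumed sentinel names (the actual scripts' token names and sampling details may differ):

```python
import random

FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"

def fim_transform(text, rng, fim_rate=1.0):
    """With probability fim_rate, split `text` at two random points and emit
    it in prefix-suffix-middle (PSM) order, so that at generation time the
    model can fill the middle after seeing prefix and suffix."""
    if rng.random() > fim_rate or len(text) < 2:
        return text  # leave this example as ordinary left-to-right text
    lo, hi = sorted(rng.sample(range(len(text) + 1), 2))
    prefix, middle, suffix = text[:lo], text[lo:hi], text[hi:]
    return FIM_PREFIX + prefix + FIM_SUFFIX + suffix + FIM_MIDDLE + middle
```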
-
j-gc authored
-
Arthur authored
* post merge update
* nit
* oups
-
- 08 Mar, 2024 12 commits
-
Winston H authored
feat: use `warning_advice` instead of tensorflow warning
-
Zach Mueller authored
* Fix eval thread fork bomb
* Keep eval dl persistent and prepare after so free_memory doesn't destroy it
* Add note
* Quality
-
Fanli Lin authored
[tests] use the correct `n_gpu` in `TrainerIntegrationTest::test_train_and_eval_dataloaders` for XPU (#29307)
* fix n_gpu
* fix style
-
Yoach Lacombe authored
fix total silence input with no_speech_threshold
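Background on the heuristic involved: Whisper-style decoding treats a segment as silence only when the model's no-speech probability is high and the decoded text is itself low-confidence. A hypothetical sketch of that check (my own illustration with Whisper's customary default thresholds, not the code touched by this fix):

```python
def is_silent_segment(no_speech_prob, avg_logprob,
                      no_speech_threshold=0.6, logprob_threshold=-1.0):
    """Treat a segment as silence when the model assigns a high no-speech
    probability AND the average token log-probability is low; confident
    transcriptions are kept even if no_speech_prob is high."""
    return no_speech_prob > no_speech_threshold and avg_logprob < logprob_threshold
```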
-
Yun Dai authored
fix FSDP config
-
Jonatan Kłosko authored
* Make sliding window size inclusive in eager attention * Fix tests
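To make the "inclusive" convention concrete, here is a toy mask builder (my own sketch of one plausible convention, not transformers' mask code): position i may attend to position j iff j <= i and i - j < window, so a window of size `window` counts the current token itself.

```python
def sliding_window_mask(seq_len, window):
    """Boolean causal sliding-window mask under an inclusive convention:
    row i is True at column j when j is causal (j <= i) and within the
    last `window` positions counting position i itself."""
    return [[j <= i and i - j < window for j in range(seq_len)]
            for i in range(seq_len)]
```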
-
liangjs authored
* fix stablelm dropout argument type error
* fix docs of _flash_attention_forward
* fix all docs of _flash_attention_forward
* fix docs of _flash_attention_forward in starcoder2
Co-authored-by: oliang <oliang@tencent.com>
-
Fanli Lin authored
* use torch_device
* skip for XPU
* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Clémentine Fourrier authored
-
Wang, Yi authored
* fix image-to-text batch incorrect output issue
* add ci test
* update ci test
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
-
Fanli Lin authored
* add sacremoses check
* fix style
* for FlaubertTokenizer
* HerbertTokenizer fix
* add typeHint
* Update src/transformers/testing_utils.py
* make less skipped
* make quality
* remove import
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
* left-padding test revisited
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-