- 03 Apr, 2024 1 commit
-
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 02 Apr, 2024 11 commits
-
-
Mario Šaško authored
-
Joao Gante authored
* fix norm * fix logits processors doctests
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors getting rid of the issue
  - Find all identical names (those should be declared in the config but we try to find them all anyway.)
  - For all identical names:
    - If they are in the config, just ignore them, everything is fine
    - If they are not, warn about them.
  - For all remainder tensors which are shared yet neither identical NOR disjoint, raise a hard error.
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
* Add small tests.
* Dead variable.
* Fixup.
* Fixing tied_Weights_keys on generic models.
* Fixup + T5 encoder/decoder tying (with different layers)
* Code quality.
* Dynamic member.
* trigger
* Fixing encoder name for other types of encoder/decoder combos.
* Fix scoping.
* Update .github/workflows/self-scheduled.yml
* Fixing the tied_weights after the call.

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
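The classification this commit walks through (identical vs. disjoint vs. conflicting shared tensors) can be sketched in plain Python. This is an illustrative model only, not the transformers implementation: each tensor view is reduced to a hypothetical `(buffer_id, offset, length)` triple standing in for a torch storage pointer.

```python
from collections import defaultdict

def classify_shared(views):
    """views: dict name -> (buffer_id, offset, length).
    Returns (identical, disjoint, conflicting) groups of names
    whose tensors share an underlying buffer."""
    by_buffer = defaultdict(list)
    for name, (buf, off, length) in views.items():
        by_buffer[buf].append((name, off, length))

    identical, disjoint, conflicting = [], [], []
    for entries in by_buffer.values():
        if len(entries) < 2:
            continue  # not shared at all, nothing to decide
        spans = {(off, length) for _, off, length in entries}
        names = [n for n, _, _ in entries]
        if len(spans) == 1:
            # all views cover exactly the same bytes: true tied weights,
            # fine if declared in the config, warn otherwise
            identical.append(names)
        else:
            # check neighbouring [off, off + length) ranges for overlap
            ordered = sorted(entries, key=lambda e: e[1])
            overlaps = any(
                a[1] + a[2] > b[1]
                for a, b in zip(ordered, ordered[1:])
            )
            # disjoint views can be cloned before saving;
            # partially overlapping ones are the hard-error case
            (conflicting if overlaps else disjoint).append(names)
    return identical, disjoint, conflicting
```

In the commit's terms: identical names should be declared as tied weights, disjoint ones can safely be copied to break the sharing, and anything left over triggers the new hard error.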
-
Minsub Lee (Matt) authored
* Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
* Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
* Exclude pad_token filtering since it is used as CTC-blank token
* Add small test for skip_special_tokens
* Update decoding test for added new token
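The pad token must be excluded from special-token filtering because Wav2Vec2 uses it as the CTC blank, which drives repeat-collapsing during decoding. A minimal, simplified sketch of that ordering (not the tokenizer's actual `_decode`):

```python
def ctc_decode(ids, id_to_token, blank_id, special_ids,
               skip_special_tokens=True):
    """Collapse repeats, drop the blank, then filter other specials.
    The blank (pad) token must reach the collapsing step: filtering it
    out early would wrongly merge genuinely repeated characters."""
    out, prev = [], None
    for i in ids:
        if i == prev:
            continue  # collapse consecutive repeats
        prev = i
        if i == blank_id:
            continue  # the blank only separates repeats, never emitted
        if skip_special_tokens and i in special_ids:
            continue  # drop remaining special tokens (e.g. <s>, <unk>)
        out.append(id_to_token[i])
    return "".join(out)
```

With the blank between the two `l` tokens, `"hello"` survives collapsing; filter the blank as an ordinary special token first and the repeated `l` would be lost.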
-
Michael authored
-
Yoach Lacombe authored
* add FA2 to o.g Musicgen
* make style
* add FA2 support to Musicgen Melody
* add generation FA2 tests to o.g Musicgen
* make style and fix copies
* add Musicgen to FA2 docs + deprecate list
* add sdpa supports to Musicgen's
* make style and fix copies
* refactor attention implementation arguments
* add Copied from to sdpa tests
* add copied form in sdpa tests melody
* add copied for FA2 generation tests
* add FA2 inference copied from
* make style
-
théo gigant authored
* fix issue with logit processor in beam search in Flax
* adding FlaxNoRepeatNGramLogitsProcessor class + unit test
* style correction and code verification
* add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests
* fix an issue where ngrams are banned only if they appear ==1 time + update description of get_previous_ngrams
* replace non-jit compatible masking of ngrams that are not yet generated with jittable version
* Revert "fix issue with logit processor in beam search in Flax"
  This reverts commit 09b70d7e4dc32d0cc4db61af09a835a9cd238b50.
* add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor
* change the method of casting to boolean of banned tokens indices
* fix code style
* remove some useless operations + significantly faster computation of update indices using jax.lax.fori_loop
* remove useless loop iterations
* set some variables that were calculated and used multiple times
* fix format
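The eager-mode logic behind a no-repeat-ngram processor is easy to state; the Flax version here reworks it into jit-compatible form with `jax.lax.fori_loop`. A purely illustrative plain-Python sketch of the underlying rule:

```python
def banned_ngram_tokens(prev_tokens, ngram_size):
    """Tokens that would complete an n-gram already present in
    prev_tokens (eager sketch of the no-repeat-ngram rule; the real
    Flax processor computes the same masks in jittable form)."""
    if len(prev_tokens) + 1 < ngram_size:
        return set()  # not enough context to form a full n-gram yet
    # map every (n-1)-token prefix seen so far to its completions
    seen = {}
    for i in range(len(prev_tokens) - ngram_size + 1):
        prefix = tuple(prev_tokens[i : i + ngram_size - 1])
        seen.setdefault(prefix, set()).add(prev_tokens[i + ngram_size - 1])
    # the current (n-1)-token suffix decides which tokens are banned
    suffix = tuple(prev_tokens[len(prev_tokens) - ngram_size + 1 :])
    return seen.get(suffix, set())
```

A logits processor would then set the scores of the returned token ids to negative infinity before sampling.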
-
Marc Sun authored
fix bug
-
Hovnatan Karapetyan authored
* Fix sinusoidal_embeddings in FlaubertModel
* Fix for Informer
* Fix for XLM
* Move sinusoidal emb for XLM
* Move sinusoidal emb for Flaubert
* Small cleanup
* Add comments on tests code copied from
* Add with Distilbert->
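For context, the fixed sinusoidal embeddings being moved here follow the classic Transformer formula. A minimal sketch; the exact channel layout differs slightly between models, so treat this as illustrative rather than the XLM/Flaubert code:

```python
import math

def sinusoidal_embeddings(n_pos, dim):
    """Fixed sinusoidal position embeddings: even channels get
    sin(pos / 10000^(i/dim)), the following odd channel gets cos
    of the same angle."""
    out = [[0.0] * dim for _ in range(n_pos)]
    for pos in range(n_pos):
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            out[pos][i] = math.sin(angle)
            if i + 1 < dim:
                out[pos][i + 1] = math.cos(angle)
    return out
```

Because the table is deterministic, it can be rebuilt at load time instead of being stored in the checkpoint, which is why where it lives in the model matters for these fixes.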
-
Arthur authored
* fix bug and add tests
* nit
* otherway to get the cur len instead of attention mask
* more places where this might have been broken
* nit
* oups
* inputs_embeds vs input_embeds
* test generated outptus
* style
* nit
* fix
* skip failing biogpt
-
Steven Liu authored
* update * feedback
-
- 01 Apr, 2024 4 commits
-
-
Joao Gante authored
-
Fanli Lin authored
[tests] fix the wrong output in `ImageToTextPipelineTests.test_conditional_generation_llava` (#29975) bug fix
-
Arthur authored
* fix copies
* nit
* style
* Update utils/check_copies.py
-
Yoach Lacombe authored
* fix FA2 tests * refactor inference test name
-
- 31 Mar, 2024 1 commit
-
-
Zach Mueller authored
* Start rework
* Fix failing test
* Include max
* Update src/transformers/trainer.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 30 Mar, 2024 6 commits
-
-
TechxGenus authored
fix awq quant
-
Bo Zheng authored
* Update qwen2_moe.md
* update link of blogpost.
* fixup

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
-
Gary Wang authored
Fixes #29690
-
Alexander Jipa authored
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
-
Jacky Lee authored
* improve: error message for best model metric * update: raise warning instead of error
-
Jacky Lee authored
fix: rope_theta for open llama
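For context, `rope_theta` is the base of the geometric frequency progression in rotary position embeddings; the fix makes Open-Llama read it from its config. A minimal sketch of the inverse frequencies the parameter controls (illustrative, not the model's actual code):

```python
def rope_inv_freq(dim, rope_theta=10000.0):
    """Inverse frequencies for rotary position embeddings: one value
    per channel pair, decaying geometrically with base rope_theta.
    Larger rope_theta stretches the wavelengths, which is how
    long-context variants extend their positional range."""
    return [1.0 / (rope_theta ** (2 * i / dim)) for i in range(dim // 2)]
```

Each position `p` is then rotated by angles `p * inv_freq[i]` in the corresponding channel pair.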
-
- 29 Mar, 2024 2 commits
-
-
fzyzcjy authored
* with with * style
-
Yih-Dar authored
* fix
* revert for qwen2
* revert for qwen2
* update
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 28 Mar, 2024 15 commits
-
-
MariaHei authored
Trainer with PyTorch now requires accelerate to be installed. Partly resolves huggingface/transformers#29174
-
Arthur authored
* fix
* fix test
* style
* nit
* rather rely on concert token to id
* fix quality
* Update src/transformers/convert_slow_tokenizer.py
-
VINAYAKK GARG authored
Fix doc issue in DebertaV2Config class
Co-authored-by: Vinayakk Garg <vigar@akamai.com>
-
Arthur authored
* fi xbc? * nit
-
Yu Chin Fabian Lim authored
* add gradient_accumulation_kwargs to AcceleratorConfig
* add suggestions from @muellerzr to docstrings, new behavior and tests
* Documentation suggestions from @muellerz
* addressed @muellerzr comments regarding tests and test utils
* moved accelerate version to top of file.
* @muellerzr's variable fix
* address @amyeroberts. fix tests and docstrings
* address @amyeroberts additional suggestions

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
-
Arthur authored
[`TokenizationLlama`] fix the way we convert tokens to strings to keep leading spaces 🚨 breaking fix (#29453)
* nit
* update test and fix test
* fixup
-
Arthur authored
* nit
* update
* oups
* Update src/transformers/models/mamba/modeling_mamba.py

Co-authored-by: Lysandre Debut <hi@lysand.re>
-
Joao Gante authored
* add hard rope scaling test
* make fixup
* quick rope scaling tests
* add copy statements
-
Christopher Keibel authored
* add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py
* add tests and raise ValueError when optimizer is None
* add second layer to test and freeze its weigths
* check if torch is available before running tests
* use decorator to check if torch is available
* fix test indentation

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
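The first of these helpers reduces to summing parameter sizes where gradients are required. A stand-in sketch of that logic; the `Param` dataclass here is a hypothetical substitute for `torch.nn.Parameter`, not the Trainer's actual code:

```python
from dataclasses import dataclass

@dataclass
class Param:
    """Minimal stand-in for torch.nn.Parameter: element count plus
    whether the parameter is trainable (frozen layers set this False)."""
    numel: int
    requires_grad: bool

def get_num_trainable_parameters(params):
    """Count elements over parameters that require gradients,
    skipping frozen ones."""
    return sum(p.numel for p in params if p.requires_grad)
```

The companion helpers follow the same pattern over the optimizer's param groups, raising if no optimizer has been set up yet.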
-
amyeroberts authored
* Safe import of LRScheduler
* Update src/transformers/trainer_pt_utils.py
* Update src/transformers/trainer_pt_utils.py
* Fix up

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Aymeric Roucher authored
-
Joao Gante authored
* replace torch.testing.assert_allclose by torch.testing.assert_close * missing atol rtol
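Both the deprecated and the replacement function test the same criterion, `|actual - expected| <= atol + rtol * |expected|`. A pure-Python sketch of that check; the rtol/atol values shown are the float32 defaults documented for `torch.testing.assert_close`:

```python
def is_close(actual, expected, rtol=1.3e-6, atol=1e-5):
    """Elementwise closeness test matching the documented
    torch.testing.assert_close criterion:
    |actual - expected| <= atol + rtol * |expected|."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )
```

The "missing atol rtol" follow-up in the commit matters because `assert_close` requires both tolerances to be given together when overriding either.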
-
Fanli Lin authored
fix typo
-
Eduardo Pacheco authored
* First commit to add flash attention 2 for GPT-2
* more improvements
* Make GPT2 pass tests and fixed Decison Transformers copies
* Fixed missing arg
* fix copies
* Added expected speedup
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Added test
* Fixed attn attribute
* Update docs/source/en/model_doc/gpt2.md
* Update docs/source/en/model_doc/gpt2.md
* Update Decision transformer attentions
* More updates
* Passing tests
* Fix copies
* Fix copies part 2
* Decision transformer updates
* Update src/transformers/models/gpt2/modeling_gpt2.py
* Fix copies
* Decision transformer not supporting flash attn
* Addressed comments
* Addressed comments
* Addressed comments

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Arthur authored
* add doc warning * fix build pr
-