- 09 Jun, 2023 8 commits
- Yih-Dar authored
  fix
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Yih-Dar authored
  fix
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Sourab Mangrulkar authored
  * fix the deepspeed test failures
  * apex fix
  * FSDP save ckpt fix
  * Update src/transformers/trainer.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- Joao Gante authored
- Matt authored
- Younes Belkada authored
  * fix bnb config json serialization
  * forward contrib credits from discussions
  Co-authored-by: Andrechang <Andrechang@users.noreply.github.com>
- Elliott Wang authored
- Arthur authored
  * prevent llama fast from returning token type ids
  * remove type hints
  * normalised False
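The Llama fix above stops the fast tokenizer from emitting `token_type_ids`. A minimal sketch of the underlying idea, assuming a tokenizer that filters its output down to a declared list of model input names (the field values below are illustrative):

```python
# A tokenizer only returns the fields listed in its model input names;
# dropping "token_type_ids" from that list removes it from the output.
MODEL_INPUT_NAMES = ["input_ids", "attention_mask"]  # no "token_type_ids"

def filter_encoding(encoding: dict) -> dict:
    """Keep only the fields the model actually consumes."""
    return {k: v for k, v in encoding.items() if k in MODEL_INPUT_NAMES}

raw = {
    "input_ids": [1, 15043, 3186],
    "token_type_ids": [0, 0, 0],
    "attention_mask": [1, 1, 1],
}
filtered = filter_encoding(raw)
```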
- 08 Jun, 2023 9 commits
- Yih-Dar authored
  * fix
  * fix
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Serge Panev authored
  * Fix typo in Llama docstrings
  * Update
  * make style
  Signed-off-by: Serge Panev <spanev@nvidia.com>
- Radamés Ajna authored
  * add trust_remote_code option
  * require_torch
- Younes Belkada authored
  [`GPT2`] Add correct keys on `_keys_to_ignore_on_load_unexpected` on all child classes of `GPT2PreTrainedModel` (#24113)
  * add correct keys on `_keys_to_ignore_on_load_unexpected`
  * oops
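`_keys_to_ignore_on_load_unexpected` holds regex patterns; checkpoint keys matching any pattern are silently discarded instead of being reported as unexpected when a state dict is loaded. A minimal sketch of that filtering step, assuming illustrative key names in the style of GPT-2's recomputed attention-bias buffers:

```python
import re

# Patterns of checkpoint keys that are safe to discard on load,
# e.g. buffers the model recomputes at init time (illustrative).
_keys_to_ignore_on_load_unexpected = [
    r"h\.\d+\.attn\.bias",
    r"h\.\d+\.attn\.masked_bias",
]

def report_unexpected(checkpoint_keys, model_keys, ignore_patterns):
    """Return checkpoint keys the model has no slot for, minus ignored ones."""
    unexpected = [k for k in checkpoint_keys if k not in model_keys]
    return [
        k for k in unexpected
        if not any(re.search(p, k) for p in ignore_patterns)
    ]

ckpt = ["wte.weight", "h.0.attn.bias", "h.0.attn.c_attn.weight", "lm_head.weight"]
model = ["wte.weight", "h.0.attn.c_attn.weight"]
leftover = report_unexpected(ckpt, model, _keys_to_ignore_on_load_unexpected)
```

Only genuinely surprising keys survive the filter; the matched bias buffer is dropped without a warning.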
- Marc Sun authored
  * fix get_keys_to_not_convert function
  * Fix style
- Sylvain Gugger authored
- Younes Belkada authored
  * v1
  * some refactor - add ST format as well
  * fix
  * add `ADAPTER_WEIGHTS_NAME` & `ADAPTER_SAFE_WEIGHTS_NAME`
- Sourab Mangrulkar authored
- Sadra Barikbin authored
- 07 Jun, 2023 18 commits
- Sylvain Gugger authored
- Sylvain Gugger authored
  * Add AzureOpenAiAgent
  * quality
  * Update src/transformers/tools/agents.py
  Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
- Zachary Mueller authored
  * Min accelerate
  * Also min version
  * Min accelerate
  * Also min version
  * To different minor version
  * Empty
- Sourab Mangrulkar authored
  * fix mixed precision prep during eval only mode
  * update to address comments
  * update to reflect the changes in accelerate
- Sylvain Gugger authored
  * Do not prepare lr scheduler as it has the right number of steps
  * Trigger CI
  * Trigger CI
  * Trigger CI
  * Add fake comment
  * Remove fake comment
  * Trigger CI please!
- Sourab Mangrulkar authored
  * fix executable batch size issue
  * fix
  * undo
- Mishig authored
  fix base workflow name
- Sylvain Gugger authored
  * Fix expected value in tests of the test fetcher
  * Fix trigger for repo util tests
- Mishig authored
- Matt authored
  * Let's see if we can use the smallest possible dummies
  * Make GPT-2's dummies a little longer
  * Just use (1,2) as the default shape
  * Update other dummies in sync
  * Correct imports for Keras 2.13
  * Shrink the Wav2Vec2 dummies
- Yih-Dar authored
  * fix
  * fix
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Younes Belkada authored
  * fix skip modules test
  * oops
  * address comments
- Michael Benayoun authored
  Fix is_optimum_neuron_available
- Younes Belkada authored
  add `safe_serialization` in push_to_hub
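The `safe_serialization` flag above selects the safetensors format when pushing a checkpoint to the Hub. A minimal sketch of the file-name choice it implies, a deliberate simplification of the real save path:

```python
def weights_file_name(safe_serialization: bool) -> str:
    # safetensors checkpoints use a different canonical file name
    # than plain PyTorch pickle checkpoints.
    return "model.safetensors" if safe_serialization else "pytorch_model.bin"

safe_name = weights_file_name(True)
legacy_name = weights_file_name(False)
```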
- Younes Belkada authored
  * support PEFT models when saving the model using trainer
  * fixup
- YangLiu authored
  * Add support for non-rust implemented tokenization for `__getitem__` method.
  * Update for error message on adding new sub-branch for `__item__` method.
  Co-authored-by: liuyang17 <liuyang17@zhihu.com>
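The `__getitem__` change above concerns indexing the batch output of Python-only ("slow") tokenizers, which lack the rust-backed per-example encodings, and adds a clearer error for the integer-index branch. A hedged sketch of that dispatch, with a hypothetical class name and illustrative data:

```python
class BatchEncodingSketch:
    """Dict-like container for tokenizer output; illustrative only."""

    def __init__(self, data, encodings=None):
        self.data = data            # field name -> values
        self.encodings = encodings  # rust-backed per-example encodings, if any

    def __getitem__(self, item):
        if isinstance(item, str):
            return self.data[item]  # works for slow and fast tokenizers
        if isinstance(item, int):
            if self.encodings is None:
                raise KeyError(
                    "Integer indexing needs rust-backed encodings; "
                    "index by field name instead."
                )
            return self.encodings[item]
        raise KeyError("Invalid key. Use a string field name or an int index.")

enc = BatchEncodingSketch({"input_ids": [[1, 2, 3]]})
ids = enc["input_ids"]
```

String keys always work; the integer branch fails loudly rather than with an opaque attribute error.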
- Patrick von Platen authored
  * [Wav2Vec2] Fix torch script
  * fix more
- Joao Gante authored
  increase atol
- 06 Jun, 2023 5 commits
- Sylvain Gugger authored
  * Fix model load when it has both code on the Hub and locally
  * Add input check with timeout
  * Add tests
  * Apply suggestions from code review
  * Some non-saved stuff
  * Add feature extractors
  * Add image processor
  * Add model
  * Add processor and tokenizer
  * Reduce timeout
  Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
- Sylvain Gugger authored
  * Fix device placement for model-parallelism in generate for encoder/decoders
  * Remove debug statements
- Yih-Dar authored
  fix
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Edward Z. Yang authored
  * Use new parametrization based weight norm if available (see https://github.com/pytorch/pytorch/pull/103001)
  * handle copies
  * black
  Signed-off-by: Edward Z. Yang <ezyang@meta.com>
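The entry above prefers the new parametrization-based weight norm when the installed torch exposes it, falling back to the legacy API otherwise. A minimal sketch of that feature-detection pattern, using stand-in namespaces rather than torch itself (all names here are illustrative):

```python
import types

# Stand-in for a utils module where the "new" API may or may not exist.
nn_utils = types.SimpleNamespace(
    weight_norm=lambda m: "legacy weight_norm",
    parametrizations=types.SimpleNamespace(
        weight_norm=lambda m: "parametrized weight_norm",
    ),
)

def pick_weight_norm(utils):
    """Prefer the parametrization-based API when available, else fall back."""
    parametrizations = getattr(utils, "parametrizations", None)
    if parametrizations is not None and hasattr(parametrizations, "weight_norm"):
        return parametrizations.weight_norm
    return utils.weight_norm

apply_norm = pick_weight_norm(nn_utils)
result = apply_norm(None)
```

Feature detection via `getattr`/`hasattr` keeps the code working across library versions without pinning a minimum release.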
- Matt authored
  * A fun new PR where I break the entire codebase again
  * A fun new PR where I break the entire codebase again
  * Handle cross-attention
  * Move calls to model(model.dummy_inputs) to the new build() method
  * Seeing what fails with the build context thing
  * make fix-copies
  * Let's see what fails with new build methods
  * Fix the pytorch crossload build calls
  * Fix the overridden build methods in vision_text_dual_encoder
  * Make sure all our build methods set self.built or call super().build(), which also sets it
  * make fix-copies
  * Remove finished TODO
  * Tentatively remove unneeded (?) line
  * Transpose b in deberta correctly and remove unused threading local
  * Get rid of build_with_dummies and all it stands for
  * Rollback some changes to TF-PT crossloading
  * Correctly call super().build()