"docs/source/en/model_doc/efficientformer.md" did not exist on "78a53d59cb6fa444a95d6be4d15fb3a25e6a8a2e"
- 13 Sep, 2023 1 commit

Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

- 05 Sep, 2023 1 commit

Joao Gante authored

- 25 Aug, 2023 1 commit

Younes Belkada authored
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* addess feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 11 Jul, 2023 1 commit

Gaurav Kumbhat authored
* 🐛 Handle empty gen_kwargs for seq2seq trainer prediction_step fn
* Update src/transformers/trainer_seq2seq.py
Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

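The gist of that fix can be sketched as a small stand-alone function. This is a minimal sketch, not the library's actual code: the name `resolve_gen_kwargs` and the `stored_defaults` parameter are hypothetical, standing in for the `self._gen_kwargs` attribute the trainer keeps.

```python
def resolve_gen_kwargs(gen_kwargs, stored_defaults):
    """If the caller passed no generation kwargs at all, fall back to the
    defaults previously stored on the trainer; otherwise use a copy of the
    caller's kwargs so the original dict is never mutated."""
    if len(gen_kwargs) > 0:
        return gen_kwargs.copy()
    return stored_defaults.copy()

# Empty gen_kwargs no longer breaks prediction_step: stored defaults apply.
defaults = {"max_length": 128, "num_beams": 4}
resolved = resolve_gen_kwargs({}, defaults)       # -> the stored defaults
override = resolve_gen_kwargs({"num_beams": 1}, defaults)  # -> caller's kwargs
```

The point of the copy is that callers can reuse their kwargs dict across calls without the trainer editing it in place.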
- 14 Apr, 2023 1 commit

Joao Gante authored

- 06 Apr, 2023 1 commit

Joao Gante authored

- 27 Mar, 2023 3 commits

Joao Gante authored
missing None check

Joao Gante authored

Nathan Fradet authored
* seq2seq trainer and training arguments accepting GenerationConfig arg
* seq2seq Trainer and training arguments docstring fixes
* Update training_args_seq2seq.py docstring
* Fixing trainer_seq2seq.py docstring
* seq2seq trainer: legacy gen args back & GenerationConfig created at init
* Seq2seq trainer: fix in case gen_config.max_new_tokens is None
* seq2seq trainer: adding legacy arg retrocompatibility
* seq2seq trainer: evaluate and predict untouched
* Apply suggestions from code review
* seq2seq trainer: adding init args, keeping IDEs hints
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

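A rough sketch of the init-time resolution this commit describes, with a stand-in `GenerationConfig` class and a hypothetical `load_generation_config` helper. The exact precedence between the config object and the legacy `generation_max_length`/`generation_num_beams` arguments is an assumption here, not the library's verified rule:

```python
class GenerationConfig:
    """Stand-in for transformers.GenerationConfig; only the two fields
    exercised below."""
    def __init__(self, max_length=20, num_beams=1):
        self.max_length = max_length
        self.num_beams = num_beams

def load_generation_config(generation_config, generation_max_length, generation_num_beams):
    """Resolve the generation config once at trainer init: start from the
    GenerationConfig passed via the training arguments (or a default one),
    then apply the legacy arguments when they were explicitly set, keeping
    old scripts working (the "legacy arg retrocompatibility" bullet)."""
    config = generation_config if generation_config is not None else GenerationConfig()
    if generation_max_length is not None:
        config.max_length = generation_max_length
    if generation_num_beams is not None:
        config.num_beams = generation_num_beams
    return config
```

Resolving once at init (rather than on every `evaluate`/`predict` call) matches the "GenerationConfig created at init" bullet above.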
- 13 Mar, 2023 1 commit

Joao Gante authored
* Let generate pick its inputs
* fix squad seq2seq example

- 06 Feb, 2023 1 commit

Sylvain Gugger authored
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies

- 31 Jan, 2023 1 commit

regisss authored
Do not log the generation config for each iteration

- 18 Oct, 2022 1 commit

Ivan Sedykh authored

- 01 Sep, 2022 1 commit

kumapo authored
* reflect max_new_tokens in gen_kwargs to `trainer.generate()`
* reflect max_new_tokens in `Seq2SeqTrainer`
* remove unnecessary variable
* Trigger CI
* fix style

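The behavior this commit adds can be sketched as follows; `build_length_kwargs` is a hypothetical helper, a minimal sketch of the idea that a caller-supplied `max_new_tokens` should take effect instead of a default `max_length` (passing both to `generate()` is ambiguous):

```python
def build_length_kwargs(gen_kwargs, default_max_length):
    """Respect max_new_tokens when the caller supplied it: drop any
    competing max_length, and only fall back to the default max_length
    when neither length control was given."""
    out = dict(gen_kwargs)
    if "max_new_tokens" in out:
        out.pop("max_length", None)  # max_new_tokens wins
    elif "max_length" not in out:
        out["max_length"] = default_max_length
    return out
```

So `trainer.evaluate(max_new_tokens=32)`-style calls control generation length instead of being silently overridden by a default `max_length`.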
- 06 Jul, 2022 1 commit

Sylvain Gugger authored
* Link to the Datasets doc
* Remove unwanted file

- 22 Jun, 2022 1 commit

Eran Hirsch authored
Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805)
* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`
* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it
* Remove `self._num_beams` from trainer classes
* Run fixup; fix "Constraint" not exposed; fix synced_gpus to actually read from param
* Use kwargs
* Copy kwargs before making changes to it
* Fix style issues, unused imports

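The plumbing this PR introduces can be sketched with a toy class. This is not the real `Seq2SeqTrainer`, just a minimal sketch of the shape: `evaluate` accepts arbitrary generate kwargs (such as a `logits_processor`), copies them before any edits, and hands them to the prediction step instead of a fixed `max_length`/`num_beams` pair.

```python
class TinyTrainer:
    """Hypothetical miniature of the kwargs flow in Seq2SeqTrainer."""

    def evaluate(self, **gen_kwargs):
        # Copy before making changes, so the caller's dict is untouched.
        self._gen_kwargs = gen_kwargs.copy()
        return self.prediction_step()

    def prediction_step(self):
        # The real trainer would call model.generate(**inputs, **self._gen_kwargs);
        # here we just surface what would be forwarded.
        return self._gen_kwargs

trainer = TinyTrainer()
forwarded = trainer.evaluate(num_beams=4, logits_processor=["my_processor"])
```

Because everything rides on `**gen_kwargs`, any new `generate()` parameter works without touching the trainer again.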
- 25 May, 2022 1 commit

Animesh Jain authored
* Support compilation via Torchdynamo, AOT Autograd, NVFuser
* Address comments
* Lint
* Stas comments - missing quality test
* Lintere
* Quality test
* Doc lint
* Reset CUDA peak mem
* Add CustomTrainer
* require a single gpu
Co-authored-by: Stas Bekman <stas@stason.org>

- 05 Apr, 2022 1 commit

John Giorgi authored
If global_attention_mask is found in the model's inputs (used by certain models, like LED) in the prediction_step method of Seq2SeqTrainer, it is added to the gen_kwargs, which are passed to model.generate(). This allows us to properly set the global attention when decoding.

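That check amounts to a one-key forward, sketched here with plain dicts (the helper name `forward_global_attention` is hypothetical; in the trainer the inputs and masks are tensors):

```python
def forward_global_attention(inputs, gen_kwargs):
    """If the batch carries a global_attention_mask (used by models like
    LED), forward it into the kwargs passed to generate(), so global
    attention is applied during generation too."""
    out = dict(gen_kwargs)
    if "global_attention_mask" in inputs:
        out["global_attention_mask"] = inputs["global_attention_mask"]
    return out
```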
- 10 Feb, 2022 1 commit

NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>

- 11 Jan, 2022 1 commit

JejuWayfarer authored
Solves the problem that metric_key_prefix differed from the one used by Trainer.

- 27 Dec, 2021 2 commits

Stas Bekman authored
* redo sans examples
* style

Sylvain Gugger authored
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring

- 23 Dec, 2021 1 commit

Patrick von Platen authored
* up * save * correct * up * correct more * up * up * up * up * up * correct * fix tf * fix * remove tokenizer

- 21 Dec, 2021 1 commit

Sylvain Gugger authored
* Convert docstrings of all configurations and tokenizers
* Processors and fixes
* Last modeling files and fixes to models
* Pipeline modules
* Utils files
* Data submodule
* All the other files
* Style
* Missing examples
* Style again
* Fix copies
* Say bye bye to rst docstrings forever

- 20 Dec, 2021 1 commit

Patrick von Platen authored
* [Seq2SeqTrainer] Remove model input name hack
* Update src/transformers/trainer_seq2seq.py
* make style
* finish

- 07 Dec, 2021 2 commits

Stas Bekman authored
* [trainer] conditional ctx managers into one wrapper
* workaround for contextlib.nullcontext for py<3.7
* Update src/transformers/trainer.py
* one more autocast
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

TranSirius authored
Fix a bug in trainer_seq2seq.py: in the else branch at line 172, generation_inputs must be a dict before being fed into self.model.generate() (#14546)

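The fix is a normalization step before the generate call, sketched here with a list standing in for a tensor; `normalize_generation_inputs` and the `main_input_name` default are hypothetical names for illustration:

```python
def normalize_generation_inputs(generation_inputs, main_input_name="input_ids"):
    """generate() is invoked as model.generate(**generation_inputs), so a
    bare tensor must first be wrapped into a dict keyed by the model's
    main input name; a dict passes through unchanged."""
    if not isinstance(generation_inputs, dict):
        generation_inputs = {main_input_name: generation_inputs}
    return generation_inputs
```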
- 29 Oct, 2021 1 commit

Haram Lee authored

- 28 Oct, 2021 1 commit

NielsRogge authored
* First draft
* Make tuple output more readable
* Replace assertions by value errors
* Make it possible to predict_with_generate for vision and speech models
* Adapt Seq2SeqTrainer to work with VisionEncoderDecoder/SpeechEncoderDecoder
* Add deprecation warning
* Add copied from statements to vision and speech encoder decoders
* Fix failing test
* Apply @patrickvonplaten's suggestion
* Use reshape instead of view for consistency

- 31 Aug, 2021 1 commit

Sylvain Gugger authored
* Add generate kwargs to Seq2SeqTrainingArguments
* typo
* Address review comments + doc
* Style

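The effect of those training-argument defaults can be sketched as a merge step; `apply_training_args_defaults` is a hypothetical helper showing the intent (caller-supplied kwargs win, and a `None` argument means "no default"):

```python
def apply_training_args_defaults(gen_kwargs, generation_max_length, generation_num_beams):
    """Apply generation defaults from the training arguments only when
    the caller did not pass their own value and the argument was set."""
    out = dict(gen_kwargs)
    if "max_length" not in out and generation_max_length is not None:
        out["max_length"] = generation_max_length
    if "num_beams" not in out and generation_num_beams is not None:
        out["num_beams"] = generation_num_beams
    return out
```

This lets users fix generation settings once in the training arguments instead of threading them through every `evaluate`/`predict` call.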
- 19 Aug, 2021 1 commit

Allan Lin authored
* Update torch.utils.data namespaces to the latest.
* Format
* Update Dataloader.
* Style

- 27 Jul, 2021 1 commit

cchen-dialpad authored
* set max_length and num_beams only when non None
* fix instance variables
* fix code style

- 02 Jun, 2021 1 commit

Stas Bekman authored
* move code and docs
* style
* moved
* restore

- 26 Apr, 2021 1 commit

LSinev authored

- 08 Apr, 2021 1 commit

Stas Bekman authored
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* workout the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

- 01 Feb, 2021 1 commit

Sylvain Gugger authored
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling

- 27 Dec, 2020 1 commit

Patrick von Platen authored

- 22 Dec, 2020 1 commit

Sylvain Gugger authored
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>