- 22 Dec, 2021 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
- 15 Dec, 2021 2 commits
- 09 Dec, 2021 2 commits
- 06 Dec, 2021 1 commit
-
-
Julien Chaumond authored
* Replace outdated model tags with their now-canonical pipeline types * spam the CI till it's green
-
- 22 Nov, 2021 2 commits
-
-
Nicholas Broad authored
* remove sum for list flattening * change to chain(*) * make chain object a list * delete empty lines per sgugger's suggestions Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Nicholas Broad <nicholas@nmbroad.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* add test for --config_overrides * remove unneeded parts of the test
-
- 28 Oct, 2021 2 commits
- 26 Oct, 2021 1 commit
-
-
Emanuel Huber authored
Updated masked-language modeling examples in pytorch with convention defined by #12789
-
- 27 Sep, 2021 2 commits
- 24 Sep, 2021 1 commit
-
-
Gunjan Chhablani authored
* Update run_glue.py * Update run_glue.py * Add model creation snippet to other scripts * Fix style
-
- 31 Aug, 2021 2 commits
- 22 Jul, 2021 2 commits
- 08 Jul, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 07 Jul, 2021 1 commit
-
-
Souvic Chakraborty authored
* Validation split percentage to be used for custom data files also Issue same as https://github.com/huggingface/transformers/issues/12406 fixed for pytorch branch run_mlm.py * Validation split added in the right place * Update run_clm.py * validation split added for custom files * Validation split added for custom files * Update run_plm.py * fixed validation split for custom files as input for pytorch examples in lm * Update run_clm_no_trainer.py * args modified
-
- 28 Jun, 2021 2 commits
-
-
Bhadresh Savani authored
* added cotext manager to datasets map * fixed style and spaces * fixed warning of deprecation * changed desc
-
Taha ValizadehAslani authored
Before the code could not be used for validation only because of this line: extension = data_args.train_file.split(".")[-1] was assuming that extension must be extracted from the training dataset. This line would run regardless of the training or validation options of the user. This would lead to an error if the user only wants to run an evaluation only and does not want to do train (because the training file does not exist). I modified it to extract extension from the training file if the user wants to do train and extract it from the validation file if the user wants to run eval. This way the code can be used for both training and validation separately.
-
- 25 Jun, 2021 2 commits
-
-
Bhadresh Savani authored
* added log_level * fix comment * fixed log_level * Trigger CI * Unfied logging * simplified args for log_level
-
Stas Bekman authored
-
- 23 Jun, 2021 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 17 Jun, 2021 3 commits
-
-
Bhavitvya Malik authored
* update desc for map in all examples * added plm * suggestions
-
Lysandre authored
-
Lysandre authored
-
- 15 Jun, 2021 1 commit
-
-
Sylvain Gugger authored
* [WIP] Model card defaults * finetuned_from default value * Add all mappings to the mapping file * Be more defensive on finetuned_from arg * Add default task tag * Separate tags from tasks * Edge case for dataset * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 14 Jun, 2021 3 commits
-
-
Kumar Abhishek authored
* [lm examples] Replicate --config_overrides addition to other LM examples * Removing no trainer files changes * Update README Co-authored-by:Kumar Abhishek <kabhishek@expedia.com>
-
Nicholas Broad authored
* Use text_column_name variable instead of "text" `text_column_name` was already defined above where I made the changes and it was also used below where I made changes. This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90% + of datasets use "text" anyway. * black formatting * make style Co-authored-by:Nicholas Broad <nicholas@nmbroad.com>
-
Sylvain Gugger authored
* Don't log anything before logging is setup in examples * Last example
-
- 25 May, 2021 2 commits
-
-
Stas Bekman authored
* fix overflow in perplexity calc * use inf * fix
-
Sylvain Gugger authored
* Add option to long only once in multinode training * Use an alternate property
-
- 12 May, 2021 2 commits
- 11 May, 2021 1 commit
-
-
Sylvain Gugger authored
* Autogenerate model cards from the Trainer * ModelCard deprecated * Fix test * Style * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments * Quality * With all metadata * Metadata * Post-merge conflict mess * Data args and all examples * Default license and languages when possible Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 29 Apr, 2021 1 commit
-
-
Sylvain Gugger authored
* Split checkpoint from model_name_or_path in examples * Address review comments * Address review comments
-