- 02 Apr, 2021 1 commit
-
-
versis authored
-
- 31 Mar, 2021 3 commits
-
-
Hemil Desai authored
* Add initial script for finetuning MLM models with accelerate
* Add evaluation metric calculation
* Fix bugs
* Use no_grad on evaluation
* update script docstring
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* PR feedback
* Fix CI failure
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* typo
* Missing )
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
WybeKoper authored
* Fixed typos
* Removed legacy colab notebook from readme
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
-
- 30 Mar, 2021 2 commits
-
-
Yih-Dar authored
-
Philipp Schmid authored
* added py7zr
* comment out check_min for sagemaker test
* added min version again
-
- 29 Mar, 2021 5 commits
-
-
Daniel Stancl authored
* Initial commit
* Another bunch of updates
* make style quality + delete debug arg from bash script
* Use compute_metrics func
* Do a few fixes
* Add copyright
* Fix typos
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Add NER example with accelerate library
* This commit contains the first (yet really unfinished) version of a script for showing how to train a HuggingFace model with their new accelerate library.
* Fix metric calculation
* make style quality
* mv ner_no_trainer to token-classification dir
* Delete --debug flag from running script
* hf_datasets -> raw_datasets
* Make a few slight adjustments
* Add an informative comment + rewrite a help comment
* Change header
* Fix a few things
* Enforce to use fast tokenizers only
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Change bash script: python3 -> accelerate launch
* make style
* Add a few missing things (see below)
* Add a max-length padding to predictions and labels to enable accelerate gather functionality
* Add PyTorch no trainer example to the example README.md
* Remove --do-train from args as being redundant for now
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Remove some obsolete args.do_train conditions from the script
* Delete --do_train from bash running script
* Delete use_slow_tokenizer from args
* Add unintentionally removed flag --label_all_tokens
* Delete --debug flag from running script
-
WybeKoper authored
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
- 28 Mar, 2021 1 commit
-
-
Bhadresh Savani authored
-
- 26 Mar, 2021 1 commit
-
- 25 Mar, 2021 1 commit
-
-
Jethro Kuan authored
Use the correct variable (raw_datasets) instead of the module (datasets) where appropriate.
-
- 23 Mar, 2021 1 commit
-
-
Bhadresh Savani authored
* added predict stage
* added test keyword in exception message
* removed example specific saving predictions
* fixed f-string error
* removed extra line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
- 22 Mar, 2021 6 commits
-
-
Eliza Szczechla authored
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
-
Boris Dayma authored
* feat: ensure unique artifact id
* feat: allow manual init
* fix: simplify reinit logic
* fix: no dropped value + immediate commits
* fix: wandb use in sagemaker
* docs: improve documentation and formatting
* fix: typos
* docs: improve formatting
-
Stas Bekman authored
Takes care of: https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open @LysandreJik
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
dependabot[bot] authored
Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Qiushi Pan authored
Fix typo.
-
Patrick von Platen authored
-
- 21 Mar, 2021 4 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Suraj Patil authored
-
- 19 Mar, 2021 6 commits
-
-
Julien Chaumond authored
* wording/typos tweaks
* Make model upload instructions simpler
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Expand a bit the presentation of examples
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Bhadresh Savani authored
* added prediction stage and eval fix
* style correction
* removed extra lines
-
Patrick von Platen authored
* finish
* fix
* fix
* fix
* fix
-
Stas Bekman authored
Following up on a security alert: https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
-
- 18 Mar, 2021 8 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* upload
* upload fine-tuning script
* improve
* adapt
* Apply suggestions from code review
* correct
* upload
* finalize
* remove @
* correct typos
-
Stas Bekman authored
* [examples/seq2seq] fix t5 examples
This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse bleu scores w/o it. w/ `27.6849`, w/o `2.374`
* added a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly
summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733
@sgugger
* specify explicitly the t5 models requiring the special handling
* one more
* update the t5 summarization example to use cnn_dailymail
* move max*samples into the top level README.md
* better wording
* better wording
-
Julien Chaumond authored
* do not gobble certain kinds of requests.ConnectionError
* Apply review comments
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
Suraj Patil authored
* add initial script
* finish script
* add shell script example
* accept chars_to_ignore as cl arg
* align the script with other example scripts
* add torchaudio dep
-
Mohamed El-Geish authored
* wav2vec2: support datasets other than LibriSpeech
* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
-
- 17 Mar, 2021 1 commit
-
-
Stas Bekman authored
* document resuming in examples
* fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* put trainer code last, adjust notes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-