- 02 May, 2020 (3 commits)

  William Falcon authored

  William Falcon authored

  Stefan Schweter authored
    * ner: parse args from .args file or JSON
    * examples: mention JSON-based configuration file support for the run_ner script

- 01 May, 2020 (1 commit)

  Julien Chaumond authored
    [qol] example scripts: parse args from .args file or JSON

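Both of the args-file commits above hook into `HfArgumentParser`'s file-based configuration. A minimal sketch of the two entry points, assuming a toy dataclass (the field names are illustrative, not taken from the real run_ner script):

```python
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class DataArguments:
    # Illustrative fields; the real scripts define model/data/training dataclasses.
    data_dir: str = field(default=".", metadata={"help": "Input data directory."})
    max_seq_length: int = field(default=128, metadata={"help": "Max sequence length."})


parser = HfArgumentParser(DataArguments)

# CLI path: with look_for_args_file=True (the default), a `<script>.args` file
# sitting next to the entry-point script is read and prepended to sys.argv.
(data_args,) = parser.parse_args_into_dataclasses(args=["--data_dir", "/tmp/ner"])

# JSON path: load every argument from a configuration file instead.
# (data_args,) = parser.parse_json_file("run_ner.json")

print(data_args.data_dir, data_args.max_seq_length)
```
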
- 29 Apr, 2020 (1 commit)

  Julien Chaumond authored
    * [file_utils] use_cdn + documentation
    * Move to CDN urls for weights
    * [urls] Hotfix for bert-base-japanese

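For context, a sketch of how the new flag resolves weight URLs; the signature is inferred from the commit message and this era's `file_utils`, so treat it as an assumption:

```python
# Assumed era API: hf_bucket_url(model_id, filename, use_cdn=...) in file_utils.
from transformers.file_utils import hf_bucket_url

# Served from the CDN by default ...
print(hf_bucket_url("bert-base-uncased", filename="pytorch_model.bin", use_cdn=True))
# ... or from the canonical S3 bucket when opting out.
print(hf_bucket_url("bert-base-uncased", filename="pytorch_model.bin", use_cdn=False))
```
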
- 28 Apr, 2020 (2 commits)

  Sam Shleifer authored
    * add known 3rd party to setup.cfg
    * comment
    * Update CONTRIBUTING.md

    Co-authored-by: Julien Chaumond <chaumond@gmail.com>

  Patrick von Platen authored
    * fix empty prompt
    * fix length in generation pipeline

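A quick sketch of the pipeline paths these fixes touch (model choice and parameters are illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The empty-prompt case that used to break:
print(generator("", max_length=20))
# And length handling on a normal prompt:
print(generator("Hello, I'm a language model,", max_length=40))
```
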
- 24 Apr, 2020 (2 commits)

  Julien Chaumond authored
    Close #3921

  Cola authored
    * Shuffle train subset
    * Cleaner shuffle

- 22 Apr, 2020 (2 commits)

  Julien Chaumond authored

  Julien Chaumond authored
    * doc
    * [tests] Add sample files for a regression task
    * [HUGE] Trainer
    * Feedback from @sshleifer
    * Feedback from @thomwolf + logging tweak
    * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
    * [glue] Use default max_seq_length of 128 like before
    * [glue] move DataTrainingArguments around
    * [ner] Change interface of InputExample, and align run_{tf,pl}
    * Re-align the pl scripts a little bit
    * ner
    * [ner] Add integration test
    * Fix language_modeling with API tweak
    * [ci] Tweak loss target
    * Don't break console output
    * amp.initialize: model must be on right device before
    * [multiple-choice] update for Trainer
    * Re-align to 827d6d6e

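Since this commit introduces the Trainer abstraction the later example scripts build on, here is a minimal, self-contained sketch of the loop. The toy dataset stands in for the featurized GLUE/NER datasets, and the call shapes follow today's Trainer, which differs in minor details from this first version:

```python
import torch
from torch.utils.data import Dataset

from transformers import BertForSequenceClassification, Trainer, TrainingArguments


class ToyDataset(Dataset):
    """Stand-in for a featurized GLUE-style dataset (fixed fake encodings)."""

    def __len__(self):
        return 8

    def __getitem__(self, i):
        return {
            "input_ids": torch.tensor([101, 2023, 2003, 1037, 3231, 102]),
            "attention_mask": torch.ones(6, dtype=torch.long),
            "labels": torch.tensor(i % 2),
        }


model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./out"),
    train_dataset=ToyDataset(),
    eval_dataset=ToyDataset(),
)
trainer.train()
print(trainer.evaluate())
```
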
- 20 Apr, 2020 (3 commits)

  Andrey Kulagin authored

  Jared T Nielsen authored
    * Add qas_id
    * Fix incorrect name in squad.py
    * Make output files optional for squad eval

  Sam Shleifer authored

- 18 Apr, 2020 (1 commit)

  Thomas Wolf authored
    * First pass on utility classes and python tokenizers
    * finishing cleanup pass
    * style and quality
    * Fix tests
    * Updating following @mfuntowicz comment
    * style and quality
    * Fix Roberta
    * fix batch_size/seq_length in BatchEncoding
    * add alignment methods + tests
    * Fix OpenAI and Transfo-XL tokenizers
    * adding trim_offsets=True default for GPT2 and RoBERTa
    * style and quality
    * fix tests
    * add_prefix_space in roberta
    * bump up tokenizers to rc7
    * style
    * unfortunately tensorflow does not like these - removing shape/seq_len for now
    * Update src/transformers/tokenization_utils.py
    * Adding doc and docstrings
    * making flake8 happy

    Co-authored-by: Stefan Schweter <stefan@schweter.it>

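The alignment methods are the most visible user-facing piece of this refactor; a sketch with a fast tokenizer (method names per the commit message, so verify against your installed version):

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
enc = tokenizer.encode_plus("Hugging Face is based in NYC")

print(enc.tokens())           # wordpiece tokens, e.g. ['[CLS]', 'hugging', ...]
print(enc.char_to_token(8))   # index of the token covering character offset 8
print(enc.token_to_chars(1))  # character span of token 1 in the original string
```
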
- 16 Apr, 2020 (3 commits)

  Sam Shleifer authored
    Renames `run_bart_sum.py` to `finetune.py`

  Patrick von Platen authored
    * Refactored use of newstest2013 to newstest2014. Fixed a bug where argparse consumed the first command line argument as the model_size argument rather than using the default model_size, by forcing explicit --model_size flag inclusion
    * More pythonic file handling through 'with' context
    * COSMETIC - ran Black and isort
    * Fixed reference to number of lines in newstest2014
    * Fixed failing test. More pythonic file handling
    * finish PR from tholiao
    * remove commented-out lines
    * make style
    * make isort happy

    Co-authored-by: Thomas Liao <tholiao@gmail.com>

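The argparse bug is worth spelling out, since it is easy to reproduce; a sketch under assumed names (the real script's arguments may differ):

```python
import argparse

parser = argparse.ArgumentParser()
# Buggy variant: a positional with nargs="?" silently swallows the first CLI
# argument, e.g. `python eval.py data.txt` binds model_size="data.txt":
#   parser.add_argument("model_size", nargs="?", default="base")
# Fixed variant: an explicit optional flag with a default.
parser.add_argument("--model_size", default="base", help="Model size to load.")
parser.add_argument("input_file", help="File with newstest2014 source sentences.")

args = parser.parse_args(["data.txt"])
print(args.model_size, args.input_file)  # -> base data.txt
```
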
  Davide Fiocco authored

- 15 Apr, 2020 (1 commit)

  Sam Shleifer authored
    * adds pytorch-lightning dependency

- 14 Apr, 2020 (1 commit)

  Patrick von Platen authored
    * remove output_past from pt
    * make style
    * add optional input length for gpt2
    * add use cache to prepare input
    * save memory in gpt2
    * correct gpt2 test inputs
    * make past input optional for gpt2
    * finish use_cache for all models
    * make style
    * delete modeling_gpt2 change in test file
    * correct docstring
    * correct is true statements for gpt2

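A sketch of the incremental decoding this enables; note that in this era the cache argument and return value were called `past`, later renamed `past_key_values` (the spelling used below):

```python
import torch

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

generated = tokenizer.encode("Hello, my dog", return_tensors="pt")
input_ids, past = generated, None
with torch.no_grad():
    for _ in range(5):
        # With the cache, each step only feeds the newest token.
        logits, past = model(input_ids, past_key_values=past, use_cache=True)[:2]
        input_ids = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, input_ids], dim=-1)

print(tokenizer.decode(generated[0]))
```
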
- 13 Apr, 2020 (1 commit)

  elk-cloner authored

- 10 Apr, 2020 (5 commits)

  Jin Young Sohn authored

  Jin Young Sohn authored
    * Initial commit to get BERT + run_glue.py on TPU
    * Add README section for TPU and address comments.
    * Cleanup TPU bits from run_glue.py (#3)

      The TPU runner is currently implemented in https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into the `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested.
    * No need to call `xm.mark_step()` explicitly (#4)

      Since for gradient accumulation we're accumulating on batches from the `ParallelLoader` instance, which marks the step itself on next().
    * Resolve R/W conflicts from multiprocessing (#5)
    * Add XLNet in list of models for `run_glue_tpu.py` (#6)
    * Add RoBERTa to list of models in TPU GLUE (#7)
    * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8)
    * Use barriers to reduce duplicate work/resources (#9)
    * Shard eval dataset and aggregate eval metrics (#10)

      Also, instead of calling `eval_loss.item()` every time, do the summation with tensors on device.
    * Change defaultdict to float
    * Reduce the pred, label tensors instead of metrics

      As brought up during review, some metrics like F1 cannot be aggregated via averaging. GLUE task metrics depend largely on the dataset, so instead we sync the prediction and label tensors so that the metrics can be computed accurately on those.
    * Only use tb_writer from master (#11)
    * Apply huggingface black code formatting
    * Style
    * Remove `--do_lower_case` as example uses cased
    * Add option to specify tensorboard logdir

      This is needed for our testing framework, which checks regressions against key metrics written by the summary writer.
    * Using configuration for `xla_device`
    * Prefix TPU specific comments.
    * num_cores clarification and namespace eval metrics
    * Cache features file under `args.cache_dir`

      Instead of under `args.data_dir`. This is needed as our test infra uses a data_dir with a read-only filesystem.
    * Rename `run_glue_tpu` to `run_tpu_glue`

    Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

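A sketch of the multiprocessing and barrier pattern behind the TPU runner (torch_xla API; the actual runner lives in the pytorch-tpu fork linked above):

```python
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp


def _mp_fn(index):
    device = xm.xla_device()
    # Barrier pattern: non-master ordinals wait while the master does one-off
    # work (e.g. caching features), then everyone proceeds together.
    if not xm.is_master_ordinal():
        xm.rendezvous("features_cached")
    # ... build or load the cached features here ...
    if xm.is_master_ordinal():
        xm.rendezvous("features_cached")
    print(index, device)


if __name__ == "__main__":
    xmp.spawn(_mp_fn, args=(), nprocs=8)
```
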
  Julien Chaumond authored

  Julien Chaumond authored
    * [examples] Generate argparsers from type hints on dataclasses
    * [HfArgumentParser] way simpler API
    * Restore run_language_modeling.py for easier diff
    * [HfArgumentParser] final tweaks from code review

  Julien Chaumond authored
    * Big cleanup of `glue_convert_examples_to_features`
    * Use batch_encode_plus
    * Cleaner wrapping of glue_convert_examples_to_features for TF @lysandrejik
    * Cleanup syntax, thanks to @mfuntowicz
    * Raise explicit error in case of user error

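A sketch of the batched call the cleanup moves to (arguments per this era's tokenizer API; newer versions spell the padding option `padding="max_length"`):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Encode several text pairs in one call instead of looping over encode_plus.
batch = tokenizer.batch_encode_plus(
    [("premise one", "hypothesis one"), ("premise two", "hypothesis two")],
    max_length=128,
    pad_to_max_length=True,
)
print(len(batch["input_ids"]), len(batch["input_ids"][0]))  # 2 examples, 128 ids each
```
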
- 07 Apr, 2020 (3 commits)

  Sam Shleifer authored

  Sam Shleifer authored

  Patrick von Platen authored
    * improve and add features to benchmark utils
    * update benchmark style
    * remove output files

- 06 Apr, 2020 (1 commit)

  Ethan Perez authored
    * Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py

      `convert_examples_to_features` sets `pad_token=0` by default, which is correct for BERT but incorrect for RoBERTa (`pad_token=1`) and XLNet (`pad_token=5`). I think the other arguments to `convert_examples_to_features` are correct, but it might be helpful if someone more familiar with this part of the codebase checked.
    * Simplifying change to match recent commits

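The underlying pitfall generalizes: pad ids differ per model family, so they should come from the tokenizer rather than a hard-coded 0. A quick check:

```python
from transformers import AutoTokenizer

for name in ("bert-base-uncased", "roberta-base", "xlnet-base-cased"):
    tokenizer = AutoTokenizer.from_pretrained(name)
    print(name, tokenizer.pad_token_id)  # 0 for BERT, 1 for RoBERTa, 5 for XLNet
```
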
- 02 Apr, 2020 (3 commits)

  Nicolas authored
    * Resizing the embedding matrix after sending it to the optimizer prevents the newly resized matrix from being updated.
    * Remove space for style matter

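The ordering matters because the optimizer holds references to the parameter tensors it was constructed with; a sketch:

```python
import torch

from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Resize FIRST: this replaces the embedding tensor with a larger one.
tokenizer.add_tokens(["<new_token>"])
model.resize_token_embeddings(len(tokenizer))

# Only now hand the parameters to the optimizer, so it tracks the new tensor;
# done the other way around, the resized rows would never receive updates.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
```
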
  Mark Kockerbeck authored

  Patrick von Platen authored
    * replace heavy t5 models with tiny random models, as was done by sshleifer
    * fix isort

- 01 Apr, 2020 (1 commit)

  Julien Chaumond authored
    * Start cleaning examples
    * Fixup

- 31 Mar, 2020 (1 commit)

  Patrick von Platen authored
    * fix conflicts
    * add model size argument to summarization
    * correct wrong import
    * fix isort
    * correct imports
    * other isort make style
    * make style

- 30 Mar, 2020 (3 commits)

  Ethan Perez authored
    * Using loaded checkpoint with --do_predict

      Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
    * Update checkpoint loading
    * Fixing model loading

  Sam Shleifer authored

  Julien Plu authored
    * Update the NER TF script to remove the softmax and set the pad token label id to -1
    * Reformat for quality and style

    Co-authored-by: Julien Plu <julien.plu@adevinta.com>

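The pair of changes amounts to: keep raw logits (no softmax) and mask label id -1 out of the loss. A sketch of that idea in plain TensorFlow (shapes are illustrative):

```python
import tensorflow as tf

logits = tf.random.normal((2, 5, 9))      # (batch, seq_len, num_labels)
labels = tf.constant([[1, 2, -1, -1, -1],
                      [3, 4, 5, -1, -1]]) # -1 marks padding positions

# Keep only the active (non-padding) positions, then score logits directly.
active = tf.not_equal(labels, -1)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss = loss_fn(tf.boolean_mask(labels, active), tf.boolean_mask(logits, active))
print(loss)
```
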
- 29 Mar, 2020 (1 commit)

  Sam Shleifer authored

- 27 Mar, 2020 (1 commit)

  Patrick von Platen authored
    * force bleu
    * fix wrong file name
    * rename file
    * different filenames for each example test
    * test files should clean up after themselves
    * test files should clean up after themselves
    * do not force bleu
    * correct typo
    * fix isort