- 11 Apr, 2020 1 commit
-
-
HUSEIN ZOLKEPLI authored
* add bert bahasa readme * update readme * update readme * added xlnet * added tiny-bert and fix xlnet readme * added albert base * added albert tiny
-
- 10 Apr, 2020 7 commits
-
-
Jin Young Sohn authored
-
Anthony MOI authored
-
Jin Young Sohn authored
* Initial commit to get BERT + run_glue.py on TPU * Add README section for TPU and address comments. * Cleanup TPU bits from run_glue.py (#3) TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * Cleanup TPU bits from run_glue.py TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py . We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * No need to call `xm.mark_step()` explicitly (#4) Since for gradient accumulation we're accumulating on batches from `ParallelLoader` instance which on next() marks the step itself. * Resolve R/W conflicts from multiprocessing (#5) * Add XLNet in list of models for `run_glue_tpu.py` (#6) * Add RoBERTa to list of models in TPU GLUE (#7) * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8) * Use barriers to reduce duplicate work/resources (#9) * Shard eval dataset and aggregate eval metrics (#10) * Shard eval dataset and aggregate eval metrics Also, instead of calling `eval_loss.item()` every time do summation with tensors on device. * Change defaultdict to float * Reduce the pred, label tensors instead of metrics As brought up during review some metrics like f1 cannot be aggregated via averaging. GLUE task metrics depends largely on the dataset, so instead we sync the prediction and label tensors so that the metrics can be computed accurately on those instead. * Only use tb_writer from master (#11) * Apply huggingface black code formatting * Style * Remove `--do_lower_case` as example uses cased * Add option to specify tensorboard logdir This is needed for our testing framework which checks regressions against key metrics writtern by the summary writer. * Using configuration for `xla_device` * Prefix TPU specific comments. * num_cores clarification and namespace eval metrics * Cache features file under `args.cache_dir` Instead of under `args.data_dir`. This is needed as our test infra uses data_dir with a read-only filesystem. * Rename `run_glue_tpu` to `run_tpu_glue` Co-authored-by:
LysandreJik <lysandre.debut@reseau.eseo.fr>
-
Julien Chaumond authored
-
Julien Chaumond authored
* [examples] Generate argparsers from type hints on dataclasses * [HfArgumentParser] way simpler API * Restore run_language_modeling.py for easier diff * [HfArgumentParser] final tweaks from code review
-
Sam Shleifer authored
- support mbart-en-ro weights - add MBartTokenizer
-
Julien Chaumond authored
* Big cleanup of `glue_convert_examples_to_features` * Use batch_encode_plus * Cleaner wrapping of glue_convert_examples_to_features for TF @lysandrejik * Cleanup syntax, thanks to @mfuntowicz * Raise explicit error in case of user error
-
- 09 Apr, 2020 5 commits
-
-
Patrick von Platen authored
* initial commit to add decoder caching for T5 * better naming for caching * finish T5 decoder caching * correct test * added extensive past testing for T5 * clean files * make tests cleaner * improve docstring * improve docstring * better reorder cache * make style * Update src/transformers/modeling_t5.py Co-Authored-By:
Yacine Jernite <yjernite@users.noreply.github.com> * make set output past work for all layers * improve docstring * improve docstring Co-authored-by:
Yacine Jernite <yjernite@users.noreply.github.com>
-
calpt authored
-
Julien Chaumond authored
-
LysandreJik authored
cc @julien-c
-
Teven authored
-
- 08 Apr, 2020 6 commits
-
-
Lysandre Debut authored
* Updating modeling tf files; adding tests * Merge `encode_plus` and `batch_encode_plus`
-
LysandreJik authored
-
Julien Chaumond authored
-
Seyone Chithrananda authored
* created readme.md * update readme with fixes Fixes from PR comments
-
Lorenzo Ampil authored
-
- 07 Apr, 2020 8 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Patrick von Platen authored
* fix egde gase for bert tokenization * add Lysandres comments for improvement * use new is_pretokenized_flag
-
Patrick von Platen authored
* improve and add features to benchmark utils * update benchmark style * remove output files
-
Michael Pang authored
* Optimize causal mask using torch.where Instead of multiplying by 1.0 float mask, use torch.where with a bool mask for increased performance. * Maintain compatiblity with torch 1.0.0 - thanks for PR feedback * Fix typo * reformat line for CI
-
Sam Shleifer authored
-
Myle Ott authored
-
Julien Chaumond authored
Close #3639 + spurious warning mentioned in #3227 cc @lysandrejik @thomwolf
-
- 06 Apr, 2020 13 commits
-
-
Teven authored
Co-authored-by:TevenLeScao <teven.lescao@gmail.com>
-
Funtowicz Morgan authored
* Renamed num_added_tokens to num_special_tokens_to_add Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Cherry-Pick: Partially fix space only input without special tokens added to the output #3091 Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Added property is_fast on PretrainedTokenizer and PretrainedTokenizerFast Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Make fast tokenizers unittests work on Windows. * Entirely refactored unittest for tokenizers fast. * Remove ABC class for CommonFastTokenizerTest * Added embeded_special_tokens tests from allenai @dirkgr * Make embeded_special_tokens tests from allenai more generic * Uniformize vocab_size as a property for both Fast and normal tokenizers * Move special tokens handling out of PretrainedTokenizer (SpecialTokensMixin) * Ensure providing None input raise the same ValueError than Python tokenizer + tests. * Fix invalid input for assert_padding when test...
-
Ethan Perez authored
* Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py `convert_examples_to_fes atures` sets `pad_token=0` by default, which is correct for BERT but incorrect for RoBERTa (`pad_token=1`) and XLNet (`pad_token=5`). I think the other arguments to `convert_examples_to_features` are correct, but it might be helpful if someone checked who is more familiar with this part of the codebase. * Simplifying change to match recent commits
-
ktrapeznikov authored
-
Manuel Romero authored
-
Manuel Romero authored
-
Manuel Romero authored
-
Manuel Romero authored
* Add model card * Fix model name in fine-tuning script
-
Manuel Romero authored
* Create model card * Fix model name in fine-tuning script
-
Manuel Romero authored
-
MichalMalyska authored
-
jjacampos authored
* Add model card for BERTeus * Update README
-
Suchin authored
* added model card * updated README * updated README * updated README * added evals * removed pico eval * Tweaks Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-