- 15 Nov, 2020 1 commit
-
-
Thomas Wolf authored
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiecce as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobug install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 12 Nov, 2020 1 commit
-
-
Julien Plu authored
-
- 23 Oct, 2020 1 commit
-
-
Ethan Perez authored
Updating the run_squad training script to handle the "longformer" `model_type`. The longformer is trained in the same was as RoBERTa, so I've added the "longformer" `model_type` (that's the right hugginface name for the LongFormer model, right?) everywhere there was a "roberta" `model_type` reference. The longformer (like RoBERTa) doesn't use `token_type_ids` (as I understand from looking at the [longformer notebook](https://github.com/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb), which is what gets updated after this change. This fix might be related to [this issue](https://github.com/huggingface/transformers/issues/7249) with SQuAD training when using run_squad.py
-
- 15 Sep, 2020 1 commit
-
-
Stas Bekman authored
-
- 27 Aug, 2020 1 commit
-
-
Tom Grek authored
-
- 30 Jul, 2020 1 commit
-
-
Sylvain Gugger authored
* Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by:Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Julien Plu <plu.julien@gmail.com>
-
- 20 Jul, 2020 2 commits
-
-
Qingqing Cao authored
The DataParallel training was fixed in https://github.com/huggingface/transformers/pull/5733, this commit also fixes the evaluation. It's more convenient when the user enables both `do_train` and `do_eval`.
-
Stas Bekman authored
* DataParallel fixes: 1. switched to a more precise check - if self.args.n_gpu > 1: + if isinstance(model, nn.DataParallel): 2. fix tests - require the same fixup under DataParallel as the training module * another fix
-
- 28 Jun, 2020 1 commit
-
-
Sam Shleifer authored
* all save_pretrained methods mkdir if not os.path.exists
-
- 17 Jun, 2020 1 commit
-
-
Lysandre authored
closes #4958
-
- 02 Jun, 2020 1 commit
-
-
Julien Chaumond authored
* Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI
-
- 07 May, 2020 1 commit
-
-
Julien Chaumond authored
* Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around
-
- 20 Apr, 2020 1 commit
-
-
Jared T Nielsen authored
* Add qas_id * Fix incorrect name in squad.py * Make output files optional for squad eval
-
- 24 Mar, 2020 1 commit
-
-
Julien Chaumond authored
-
- 02 Mar, 2020 1 commit
-
-
Victor SANH authored
* fix n_gpu count when no_cuda flag is activated * someone was left behind
-
- 21 Feb, 2020 1 commit
-
-
maximeilluin authored
* Added CamembertForQuestionAnswering * fixed camembert tokenizer case
-
- 04 Feb, 2020 1 commit
-
-
Yuval Pinter authored
* pass langs parameter to certain XLM models Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`. Allows resolution of issue #1799 . * fixing from `make style` * fixing style (again)
-
- 28 Jan, 2020 1 commit
-
-
Lysandre authored
-
- 17 Jan, 2020 1 commit
-
-
jiyeon_baek authored
Rul -> Run
-
- 16 Jan, 2020 1 commit
-
-
Lysandre authored
-
- 08 Jan, 2020 2 commits
-
-
Lysandre authored
-
Lysandre Debut authored
-
- 06 Jan, 2020 2 commits
-
-
alberduris authored
-
alberduris authored
-
- 22 Dec, 2019 5 commits
-
-
Aymeric Augustin authored
-
Aymeric Augustin authored
-
Aymeric Augustin authored
This change is mostly autogenerated with: $ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py I made minor changes in the generated diff. -
Aymeric Augustin authored
-
Aymeric Augustin authored
This is the result of: $ isort --recursive examples templates transformers utils hubconf.py setup.py
-
- 21 Dec, 2019 3 commits
-
-
Aymeric Augustin authored
This is the result of: $ black --line-length 119 examples templates transformers utils hubconf.py setup.py There's a lot of fairly long lines in the project. As a consequence, I'm picking the longest widely accepted line length, 119 characters. This is also Thomas' preference, because it allows for explicit variable names, to make the code easier to understand. -
thomwolf authored
-
thomwolf authored
-
- 19 Dec, 2019 1 commit
-
-
Francesco authored
Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script
-
- 16 Dec, 2019 1 commit
-
-
Lysandre authored
-
- 14 Dec, 2019 1 commit
-
-
erenup authored
-
- 13 Dec, 2019 2 commits
- 12 Dec, 2019 1 commit
-
-
LysandreJik authored
-
- 11 Dec, 2019 1 commit
-
-
Bilal Khan authored
-
- 10 Dec, 2019 1 commit
-
-
LysandreJik authored
-