- 17 Jun, 2020 1 commit
-
Sylvain Gugger authored
* Make default_data_collator more flexible
* Accept tensors for all features
* Document code
* Refactor
* Formatting
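A minimal sketch of the more flexible collator described above (the toy features and the submodule import are illustrative): it batches a list of feature dicts whether the values are Python lists or tensors, and collects the "label" field into `batch["labels"]`.

```python
from transformers.data.data_collator import default_data_collator

features = [
    {"input_ids": [101, 2023, 102], "attention_mask": [1, 1, 1], "label": 0},
    {"input_ids": [101, 2003, 102], "attention_mask": [1, 1, 1], "label": 1},
]
batch = default_data_collator(features)
print(batch["input_ids"].shape)  # torch.Size([2, 3])
print(batch["labels"])           # tensor([0, 1])
```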
-
- 16 Jun, 2020 3 commits
-
Sam Shleifer authored
-
Amil Khare authored
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
-
Funtowicz Morgan authored
* Added is_fast property on BatchEncoding to indicate if the object comes from a Fast Tokenizer.
* Added __getstate__() & __setstate__() so BatchEncoding is picklable.
* Correct tokens() return type from List[int] to List[str]
* Added unittest for BatchEncoding pickle/unpickle
* Added unittest for BatchEncoding is_fast
* More careful checking on BatchEncoding unpickle tests.
* Formatting.
* is_fast should assertTrue on Rust tokenizers.
* Ensure tensorflow has correct way of checking array_equal
* More formatting.
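A short usage sketch of the properties added above (checkpoint name purely illustrative): `is_fast` reports whether a Rust-backed tokenizer produced the encoding, `tokens()` returns strings, and the object round-trips through pickle.

```python
import pickle
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoding = tokenizer("Hello world")

print(encoding.is_fast)    # True for a Rust ("fast") tokenizer
print(encoding.tokens())   # e.g. ['[CLS]', 'hello', 'world', '[SEP]']

# BatchEncoding is now picklable thanks to __getstate__/__setstate__
restored = pickle.loads(pickle.dumps(encoding))
print(restored["input_ids"] == encoding["input_ids"])  # True
```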
-
- 15 Jun, 2020 5 commits
-
Sylvain Gugger authored
* Add `DistilBertForMultipleChoice`
-
Anthony MOI authored
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
* Use tokenizers pre-tokenized pipeline
* failing pretokenized test
* Fix is_pretokenized in python
* add pretokenized tests
* style and quality
* better tests for batched pretokenized inputs
* tokenizers clean up - new padding_strategy - split the files
* [HUGE] refactoring tokenizers - padding - truncation - tests
* style and quality
* bump up required tokenizers version to 0.8.0-rc1
* switched padding/truncation API - simpler, better backward compat
* updating tests for custom tokenizers
* style and quality - tests on pad
* fix QA pipeline
* fix backward compatibility for max_length only
* style and quality
* Various cleanups - add verbose
* fix tests
* update docstrings
* Fix tests
* Docs reformatted
* __call__ method documented

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
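A minimal sketch of the refactored API described above (model name illustrative): padding and truncation are driven by simple keyword arguments on the tokenizer's `__call__`, and pre-tokenized input goes through the same entry point.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

batch = tokenizer(
    ["A short sentence.", "A slightly longer second sentence."],
    padding=True,        # pad to the longest sequence in the batch
    truncation=True,     # truncate to the model's maximum length
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # (2, length of the longest sequence)

# the pre-tokenized pipeline uses the same call
pre = tokenizer([["Hello", "world"]], is_pretokenized=True)
```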
-
Patrick von Platen authored
* fix test
* Update tests/test_modeling_common.py
* Update tests/test_modeling_common.py
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Make DataCollator a callable
* Update src/transformers/data/data_collator.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
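A sketch of the change above: a data collator is now any callable mapping a list of examples to a batch dict, so a plain function can be handed to the Trainer (the Trainer call is only shown as a comment, since it needs a model and dataset).

```python
from transformers.data.data_collator import default_data_collator

def my_data_collator(examples):
    batch = default_data_collator(examples)
    # room for task-specific post-processing of the batch here
    return batch

# Trainer(model=model, args=training_args,
#         train_dataset=train_dataset, data_collator=my_data_collator)
```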
-
- 12 Jun, 2020 4 commits
-
Suraj Patil authored
-
Sylvain Gugger authored
* Add AlbertForMultipleChoice
* Make up to date and add all models to common tests
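A hedged sketch of how a multiple-choice head such as the one added above is typically fed (checkpoint name illustrative): inputs get an extra num_choices dimension, so shapes are (batch_size, num_choices, seq_len), and the model scores one logit per choice.

```python
import torch
from transformers import AlbertTokenizer, AlbertForMultipleChoice

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertForMultipleChoice.from_pretrained("albert-base-v2")

prompt = "The cat sat on"
choices = ["the mat.", "the moon."]
enc = tokenizer([prompt, prompt], choices, padding=True, return_tensors="pt")

# add the num_choices dimension: (1, 2, seq_len)
inputs = {k: v.unsqueeze(0) for k, v in enc.items()}
outputs = model(**inputs, labels=torch.tensor([0]))  # (loss, logits, ...) tuple
```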
-
Patrick von Platen authored
* first commit
* add new auto models
* better naming
* fix bert automodel
* fix automodel for pretraining
* add models to init
* fix name typo
* fix typo
* better naming
* future warning instead of deprecation warning
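The class names below are assumed from the commit description (task-specific auto models replacing the single LM-head auto model, with the old entry point only emitting a FutureWarning); treat this as an illustrative sketch rather than the exact diff.

```python
from transformers import AutoModelForMaskedLM, AutoModelForCausalLM, AutoModelForSeq2SeqLM

mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # masked-LM head
clm = AutoModelForCausalLM.from_pretrained("gpt2")               # causal-LM head
s2s = AutoModelForSeq2SeqLM.from_pretrained("t5-small")          # encoder-decoder LM head

# the old AutoModelWithLMHead still works but now emits a FutureWarning
```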
-
Sam Shleifer authored
-
- 11 Jun, 2020 2 commits
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Support multiple choice in tf common model tests
* Add the input_embeds test
-
- 10 Jun, 2020 8 commits
-
RafaelWO authored
* Fixed resize_token_embeddings for transfo_xl model
* Fixed resize_token_embeddings for transfo_xl. Added custom methods to TransfoXLPreTrainedModel for resizing layers of the AdaptiveEmbedding.
* Updated docstring
* Fixed resizing cutoffs; added check for new size of embedding layer.
* Added test for resize_token_embeddings
* Fixed code quality
* Fixed unchanged cutoffs in model.config

Co-authored-by: Rafael Weingartner <rweingartner.its-b2015@fh-salzburg.ac.at>
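A small sketch of the fixed behaviour (checkpoint name illustrative): after adding tokens, resizing the embeddings of a Transfo-XL model now goes through the custom AdaptiveEmbedding handling described above.

```python
from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")

tokenizer.add_tokens(["<new_token>"])
model.resize_token_embeddings(len(tokenizer))  # resizes the adaptive embedding layer
```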
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Split LMBert model in two
* Fix example
* Remove lm_labels
* Adapt tests, refactor prepare_for_generation
* Fix merge
* Hide BertLMHeadModel
-
Suraj Patil authored
* ElectraForQuestionAnswering
* update __init__
* add test for electra qa model
* add ElectraForQuestionAnswering in auto models
* add ElectraForQuestionAnswering in all_model_classes
* fix outputs, input_ids defaults to None
* add ElectraForQuestionAnswering in docs
* remove commented line
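A hedged sketch of the call pattern for the new QA head (checkpoint illustrative; a freshly initialized head will of course predict arbitrary spans): the model returns start and end logits over the question/context pair.

```python
from transformers import ElectraTokenizer, ElectraForQuestionAnswering

tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForQuestionAnswering.from_pretrained("google/electra-small-discriminator")

inputs = tokenizer("Who wrote Hamlet?", "Hamlet was written by William Shakespeare.",
                   return_tensors="pt")
start_logits, end_logits = model(**inputs)[:2]

start, end = int(start_logits.argmax()), int(end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1].tolist())
```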
-
Amil Khare authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Fix CI
-
Sylvain Gugger authored
* Deal with multiple choice in common tests
-
- 09 Jun, 2020 2 commits
-
Bharat Raghunathan authored
* DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``
* DOC: Apply Black Formatting
* Fix errors where output_attentions was undefined
* Remove output_attentions in classes per review
* Fix regressions on tests having `output_attention`
* Fix further regressions in tests relating to `output_attentions`: ensure proper propagation of `output_attentions` as a function parameter to all model subclasses
* Fix more regressions in `test_output_attentions`
* Fix issues with BertEncoder
* Rename related variables to `output_attentions`
* fix pytorch tests
* fix bert and gpt2 tf
* Fix most TF tests for `test_output_attentions`
* Fix linter errors and more TF tests
* fix conflicts
* DOC: Apply Black Formatting
* Fix errors where output_attentions was undefined
* Remove output_attentions in classes per review
* Fix regressions on tests having `output_attention`
* fix conflicts
* fix conflicts
* fix conflicts
* fix conflicts
* fix pytorch tests
* fix conflicts
* fix conflicts
* Fix linter errors and more TF tests
* fix tf tests
* make style
* fix isort
* improve output_attentions
* improve tensorflow

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
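A sketch of the new calling convention (checkpoint illustrative): attentions are requested per forward call instead of via config.output_attentions, and show up as the last element of the tuple outputs of this era.

```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Attention, please.", return_tensors="pt")
outputs = model(**inputs, output_attentions=True)

attentions = outputs[-1]                      # one tensor per layer
print(len(attentions), attentions[0].shape)   # 12, torch.Size([1, 12, seq, seq])
```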
-
Patrick von Platen authored
* add tpu and torchscript for benchmark
* fix name in tests
* "fix email"
* make style
* better log message for tpu
* add more print and info for tpu
* allow possibility to print tpu metrics
* correct cpu usage
* fix test for non-install
* remove bogus file
* include psutil in testing
* run a couple of times before tracing in torchscript
* do not allow tpu memory tracing for now
* make style
* add torchscript to env
* better name for torch tpu

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
- 08 Jun, 2020 1 commit
-
Patrick von Platen authored
-
- 06 Jun, 2020 1 commit
-
Sam Shleifer authored
-
- 05 Jun, 2020 4 commits
-
Sam Shleifer authored
-
Patrick von Platen authored
* automatically set decoder config to decoder
* add more tests
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Fix argument label
* Fix test
-
- 04 Jun, 2020 2 commits
-
Julien Plu authored
* Better None gradients handling
* Apply Style
* Apply Style
* Create a loss class per task to compute its respective loss
* Add loss classes to the ALBERT TF models
* Add loss classes to the BERT TF models
* Add question answering and multiple choice to TF Camembert
* Remove prints
* Add multiple choice model to TF DistilBERT + loss computation
* Add question answering model to TF Electra + loss computation
* Add token classification, question answering and multiple choice models to TF Flaubert
* Add multiple choice model to TF Roberta + loss computation
* Add multiple choice model to TF XLM + loss computation
* Add multiple choice and question answering models to TF XLM-Roberta
* Add multiple choice model to TF XLNet + loss computation
* Remove unused parameters
* Add task loss classes
* Reorder TF imports + add new model classes
* Add new model classes
* Bugfix in TF T5 model
* Bugfix for TF T5 tests
* Bugfix in TF T5 model
* Fix TF T5 model tests
* Fix T5 tests + some renaming
* Fix inheritance issue in the AutoX tests
* Add tests for TF Flaubert and TF XLM Roberta
* Add tests for TF Flaubert and TF XLM Roberta
* Remove unused piece of code in the TF trainer
* bugfix and remove unused code
* Bugfix for TF 2.2
* Apply Style
* Divide TFSequenceClassificationAndMultipleChoiceLoss into its two respective names
* Apply style
* Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameters and better dataset handling
* Fix TF optimizations tests and apply style
* Remove useless parameter
* Bugfix and apply style
* Fix TF Trainer prediction
* Now the TF models return the loss such as their PyTorch counterparts
* Apply Style
* Ignore some tests output
* Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models.
* Fix names for SQuAD data
* Apply Style
* Fix conflicts with 2.11 release
* Fix conflicts with 2.11
* Fix wrong name
* Add better documentation on the new create_optimizer function
* Fix isort
* logging_dir: use same default as PyTorch

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Funtowicz Morgan authored
* Refactor tensor creation in tokenizers.
* Make sure to convert string to TensorType
* Refactor convert_to_tensors_
* Introduce numpy tensor creation
* Format
* Add unittest for TensorType creation from str
* sorting imports
* Added unittests for numpy tensor conversion.
* Do not use in-place version for squeeze as numpy doesn't provide such feature.
* Added extra parameter prepend_batch_axis: bool on prepare_for_model.
* Ensure test_np_encode_plus_sent_to_model is not executed if encoder/decoder model.
* style.
* numpy tests require_torch for now while flax not merged.
* Hopefully will make flake8 happy.
* One more time 🎵
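A sketch of the refactored tensor creation (checkpoint illustrative): return_tensors now also understands numpy, alongside the existing PyTorch and TensorFlow options.

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

np_batch = tokenizer("Convert me to numpy.", return_tensors="np")
print(type(np_batch["input_ids"]))  # <class 'numpy.ndarray'>

pt_batch = tokenizer("Convert me to torch.", return_tensors="pt")  # needs torch installed
```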
-
- 03 Jun, 2020 1 commit
-
Sylvain Gugger authored
* Deprecate masked_lm_labels argument
* Apply to all models
* Better error message
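A sketch of the deprecation above (checkpoint illustrative): `labels` replaces `masked_lm_labels`; the old keyword still works but warns. Positions you don't want scored are usually set to -100 in a real run.

```python
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tokenizer("Paris is the capital of [MASK].", return_tensors="pt")
labels = inputs["input_ids"].clone()  # toy labels; mask unwanted positions with -100 in practice

loss, prediction_scores = model(**inputs, labels=labels)[:2]  # new argument
# model(**inputs, masked_lm_labels=labels)                    # deprecated spelling
```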
-
- 02 Jun, 2020 4 commits
-
Patrick von Platen authored
* improve handling of short inputs for reformer
* correct typo in assert statement
* fix other tests
-
Sam Shleifer authored
-
Julien Chaumond authored
* 🐛 Fix model ids for BART and Flaubert
-
Julien Chaumond authored
* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
-
- 01 Jun, 2020 1 commit
-
Rens authored
* pass on tokenizer to pipeline
* order input names when convert to onnx
* update style
* remove unused imports
* make ordered inputs list needs to be mutable
* add test custom bert model
* remove unused imports
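A heavily hedged sketch of the export entry point touched above; the argument names follow the script's documented usage at the time and should be treated as illustrative rather than exact.

```python
from transformers.convert_graph_to_onnx import convert

# Export a PyTorch model to ONNX; the tokenizer is forwarded to the underlying
# pipeline and the ONNX graph keeps the model's ordered input names.
convert(
    framework="pt",
    model="bert-base-cased",
    output="onnx/bert-base-cased.onnx",
    opset=11,
)
```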
-
- 29 May, 2020 1 commit
-
Patrick von Platen authored
* fix bug
* add more tests
-