- 27 May, 2020 1 commit
-
Sam Shleifer authored
-
- 25 May, 2020 2 commits
-
Sam Shleifer authored
-
Suraj Patil authored
* added LongformerForQuestionAnswering
* add LongformerForQuestionAnswering
* fix import for LongformerForMaskedLM
* add LongformerForQuestionAnswering
* hardcoded sep_token_id
* compute attention_mask if not provided
* combine global_attention_mask with attention_mask when provided
* update example in docstring
* add assert error messages, better attention combine
* add test for LongformerForQuestionAnswering
* typo
* cast global_attention_mask to long
* make style
* Update src/transformers/configuration_longformer.py
* Update src/transformers/configuration_longformer.py
* fix the code quality
* Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
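A minimal sketch of what the new head enables, assuming the `allenai/longformer-large-4096-finetuned-triviaqa` checkpoint (not named in this commit) and the return conventions of this era; per the commit, when no `global_attention_mask` is supplied the model computes one itself, putting global attention on the question tokens up to the hardcoded `sep_token_id`:

```python
import torch
from transformers import LongformerForQuestionAnswering, LongformerTokenizer

ckpt = "allenai/longformer-large-4096-finetuned-triviaqa"  # example checkpoint
tokenizer = LongformerTokenizer.from_pretrained(ckpt)
model = LongformerForQuestionAnswering.from_pretrained(ckpt)

question = "Who wrote Hamlet?"
context = "Hamlet is a tragedy written by William Shakespeare between 1599 and 1601."
inputs = tokenizer.encode_plus(question, context, return_tensors="pt")

# No global_attention_mask is passed, so the model builds one that gives the
# question tokens (everything before the first separator) global attention.
start_logits, end_logits = model(**inputs)[:2]
start, end = start_logits.argmax().item(), end_logits.argmax().item()
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```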
-
- 22 May, 2020 2 commits
-
Anthony MOI authored
-
Frankie Liuzzi authored
* added functionality for electra classification head
* unneeded dropout
* Test ELECTRA for sequence classification
* Style

Co-authored-by: Frankie <frankie@frase.io>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
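A hedged sketch of the new sequence-classification head; `google/electra-small-discriminator` is just an example checkpoint, and the classification layer it gains here starts out untrained:

```python
from transformers import ElectraForSequenceClassification, ElectraTokenizer

tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
# The classification head is freshly initialized, so the model needs
# fine-tuning before its predictions mean anything.
model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=2
)

inputs = tokenizer.encode_plus("A delightfully watchable film.", return_tensors="pt")
logits = model(**inputs)[0]  # shape: (1, num_labels)
```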
-
- 21 May, 2020 1 commit
-
Zhangyx authored
Adds a predict stage for GLUE tasks, and generates result files which can be submitted to gluebenchmark.com (#4463)
* Adds predict stage for glue tasks, and generates result files which can be submitted to the gluebenchmark.com website.
* Use Split enum + always output the label name

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
- 20 May, 2020 3 commits
-
Julien Chaumond authored
-
Julien Chaumond authored
-
Lysandre Debut authored
* There is one missing key in BERT * Correct device for CamemBERT model * RoBERTa tokenization adding prefix space * Style
-
- 19 May, 2020 7 commits
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Patrick von Platen authored
* fix GPU slow tests in PyTorch * change model-to-device syntax
-
Sam Shleifer authored
-
Iz Beltagy authored
* first commit
* bug fixes
* better examples
* undo padding
* remove wrong VOCAB_FILES_NAMES
* License
* make style
* make isort happy
* unit tests
* integration test
* make `black` happy by undoing `isort` changes!!
* lint
* no need for the padding value
* batch_size not bsz
* remove unused type casting
* seqlen not seq_len
* staticmethod
* `bert` selfattention instead of `n2`
* uint8 instead of bool + lints
* pad inputs_embeds using embeddings not a constant
* black
* unit test with padding
* fix unit tests
* remove redundant unit test
* upload model weights
* resolve todo
* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
* increase unittest coverage
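A rough usage sketch for the new model, using the `allenai/longformer-base-4096` weights uploaded with this PR. In this first version global attention is requested through `attention_mask` values rather than a separate argument, and inputs are padded internally to a multiple of the attention window:

```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

text = " ".join(["Long documents need sparse attention patterns."] * 150)
input_ids = tokenizer.encode(text, return_tensors="pt")  # up to 4096 tokens

# 0 = padding, 1 = local sliding-window attention, 2 = global attention.
attention_mask = torch.ones_like(input_ids)
attention_mask[:, 0] = 2  # give the leading <s> token global attention

sequence_output = model(input_ids, attention_mask=attention_mask)[0]
```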
-
Julien Chaumond authored
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency, only write to disk from world_master. Closes https://github.com/huggingface/transformers/issues/4272
* Working distributed eval
* Hook into scripts
* Fix #3721 again
* TPU.mesh_reduce: stay in tensor space. Thanks @jysohn23
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarios
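The sampler's core idea, condensed into a standalone sketch (the class name here is hypothetical; the real `SequentialDistributedSampler` lives next to the Trainer): every rank evaluates one contiguous, equally sized shard, so gathered predictions can be concatenated in rank order and truncated back to the true dataset length:

```python
from torch.utils.data import Sampler

class SequentialShardSampler(Sampler):
    """Hypothetical condensation of SequentialDistributedSampler: contiguous,
    equal shards per rank, padded by wrapping around the dataset."""

    def __init__(self, dataset, num_replicas, rank):
        self.num_samples = -(-len(dataset) // num_replicas)  # ceil division
        total_size = self.num_samples * num_replicas
        indices = list(range(len(dataset)))
        indices += indices[: total_size - len(indices)]  # pad so shards are even
        self.indices = indices[rank * self.num_samples : (rank + 1) * self.num_samples]

    def __iter__(self):
        return iter(self.indices)

    def __len__(self):
        return self.num_samples

# After evaluation, per-rank predictions are all-gathered, concatenated in
# rank order, and truncated to len(dataset) to drop the wrap-around padding.
```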
-
Julien Chaumond authored
* Test case for #3936
* multigpu tests pass on pytorch 1.4.0
* Fixup
* multigpu tests pass on pytorch 1.5.0
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* rename multigpu to require_multigpu
* mode doc
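A sketch of the logic behind the renamed decorator (the exact helper lives in the test utilities; this standalone version is an assumption from the description):

```python
import unittest
import torch

def require_multigpu(test_case):
    """Skip the decorated test unless at least two CUDA devices are visible."""
    if torch.cuda.device_count() < 2:
        return unittest.skip("test requires multiple GPUs")(test_case)
    return test_case

@require_multigpu
def test_model_parallel_forward():
    ...  # only exercised on multi-GPU machines
```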
-
- 18 May, 2020 3 commits
-
Sam Shleifer authored
-
Patrick von Platen authored
* fix fp16 in T5 * make style * refactor invert_attention_mask fn * fix typo
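The gist of the fp16 fix, sketched from the description rather than copied from the diff: fp16 cannot represent -1e9 (its minimum is roughly -65504), so the additive bias used to mask attention must be chosen per dtype or scores overflow and turn into NaNs:

```python
import torch

def invert_attention_mask(encoder_attention_mask: torch.Tensor, dtype: torch.dtype):
    # Sketch: turn a 0/1 padding mask into an additive attention bias, with a
    # large negative value that stays finite in half precision.
    inverted = 1.0 - encoder_attention_mask[:, None, None, :].to(dtype)
    min_value = -1e4 if dtype == torch.float16 else -1e9
    return inverted * min_value
```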
-
Funtowicz Morgan authored
-
- 17 May, 2020 1 commit
-
Lorenzo Ampil authored
* Add index to be returned by NerPipeline to allow for the creation of entity groups
* Add entity groups
* Convert entity list to dict
* Add entity to entity_group_disagg after updating entity groups
* Change 'group' parameter to 'grouped_entities'
* Add unit tests for grouped NER pipeline case
* Correct variable name typo for NER_FINETUNED_MODELS
* Sync grouped tests to recent test updates
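What the new flag looks like in use; `grouped_entities` is the parameter named in this PR (later library versions expose the same behavior under `aggregation_strategy`):

```python
from transformers import pipeline

# grouped_entities merges consecutive word-piece predictions that share a
# label into one span instead of returning raw per-token entities.
nlp = pipeline("ner", grouped_entities=True)
print(nlp("Hugging Face is based in New York City."))
# e.g. [{'entity_group': 'ORG', 'word': 'Hugging Face', ...},
#       {'entity_group': 'LOC', 'word': 'New York City', ...}]
```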
-
- 14 May, 2020 3 commits
-
Funtowicz Morgan authored
* Added generic ONNX conversion script for PyTorch model.
* WIP initial TF support.
* TensorFlow/Keras ONNX export working.
* Print framework version info
* Add possibility to check the model is correctly loading on ONNX runtime.
* Remove quantization option.
* Specify ONNX opset version when exporting.
* Formatting.
* Remove unused imports.
* Make functions more generally reusable from other parts of the code.
* isort happy.
* flake happy
* Export only feature-extraction for now
* Correctly check inputs order / filter before export.
* Removed task variable
* Fix invalid args call in load_graph_from_args.
* Fix invalid args call in convert.
* Fix invalid args call in infer_shapes.
* Raise exception and catch in caller function instead of exit.
* Add 04-onnx-export.ipynb notebook
* More WIP on the notebook
* Remove unused imports
* Simplify & remove unused constants.
* Export with constant_folding in PyTorch
* Let's try to put function args in the right order this time ...
* Disable external_data_format temporarily
* ONNX notebook draft ready.
* Updated notebooks charts + wording
* Correct error while exporting last chart in notebook.
* Addressing @LysandreJik comment.
* Set ONNX opset to 11 as default value.
* Set opset param mandatory
* Added ONNX export unittests
* Quality.
* flake8 happy
* Add keras2onnx dependency on extras["tf"]
* Pin keras2onnx on github master to v1.6.5
* Second attempt.
* Third attempt.
* Use the right repo URL this time ...
* Do the same for onnxconverter-common
* Added keras2onnx and onnxconverter-common to 1.7.0 to support TF 2.2
* Correct commit hash.
* Addressing PR review: Optimizations are enabled by default.
* Addressing PR review: small changes in the notebook
* setup.py comment about keras2onnx versioning.
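The converter can also be driven from Python; a minimal sketch under the defaults described above (feature-extraction only, opset 11), with `bert-base-cased` as a stand-in checkpoint:

```python
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

# Exports the feature-extraction graph of the checkpoint to ONNX, with
# constant folding enabled on the PyTorch side.
convert(
    framework="pt",  # "tf" for the TensorFlow/Keras (keras2onnx) path
    model="bert-base-cased",
    output=Path("onnx/bert-base-cased.onnx"),
    opset=11,
)
```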
-
Sam Shleifer authored
Covers torch and TF; also fixes a failing @slow test.
-
Julien Chaumond authored
* Fix: unpin flake8 and fix cs errors * Ok we still need to quote those
-
- 13 May, 2020 2 commits
-
Sam Shleifer authored
[Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models (#4290)
-
Julien Chaumond authored
* Improvements to the wandb integration
* small reorg + no global necessary
* feat(trainer): log epoch and final metrics
* Simplify logging a bit
* Fixup
* Fix crash when just running eval

Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
-
- 12 May, 2020 1 commit
-
Julien Chaumond authored
-
- 10 May, 2020 1 commit
-
Sam Shleifer authored
- MarianSentencepieceTokenizer -> MarianTokenizer
- Start using unk token.
- add docs page
- add better generation params to MarianConfig
- more conversion utilities
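A sketch with one of the newly named multilingual checkpoints; `prepare_translation_batch` is the tokenizer helper of this era (later versions call the tokenizer directly), and `>>fr<<`-style language codes pick the target language:

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"  # example multilingual model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

src = [">>fr<< this is a sentence in english that we want to translate"]
batch = tokenizer.prepare_translation_batch(src)
generated = model.generate(**batch)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```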
-
- 08 May, 2020 1 commit
-
Patrick von Platen authored
* fix PR * move tests to correct place
-
- 07 May, 2020 4 commits
-
Jared T Nielsen authored
* Add AlbertForPreTraining and TFAlbertForPreTraining models.
* PyTorch conversion
* TensorFlow conversion
* style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
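A small sketch of the two heads the new class exposes, masked-language modeling plus sentence-order prediction (SOP); `albert-base-v2` is just an example checkpoint:

```python
from transformers import AlbertForPreTraining, AlbertTokenizer

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertForPreTraining.from_pretrained("albert-base-v2")

inputs = tokenizer.encode_plus("Hello, my dog is cute", return_tensors="pt")
# Without labels, the outputs are the MLM logits and the SOP logits, in order.
prediction_logits, sop_logits = model(**inputs)[:2]
print(prediction_logits.shape, sop_logits.shape)
```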
-
Julien Chaumond authored
* Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around
-
Funtowicz Morgan authored
* Rewritten batch support in pipelines.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix imports sorting 🔧
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Set pad_to_max_length=True by default on Pipeline.
* Set pad_to_max_length=False for generation pipelines. Most generation models don't have a padding token.
* Address @joeddav review comment: Uniformized *args.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Address @joeddav review comment: Uniformized *args (second).
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
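With the rewritten batching, a list of inputs goes through one padded forward pass; a minimal illustration (model choice left to the pipeline's default):

```python
from transformers import pipeline

# Inputs of different lengths are padded together; generation pipelines opt
# out of padding since most generation models have no padding token.
nlp = pipeline("sentiment-analysis")
print(nlp(["I love this movie.", "Two hours of my life I will never get back."]))
```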
-
Patrick von Platen authored
* first copy & paste commit from Bert and Morgan's LSH code
* add easy way to compare to trax original code
* translate most of function
* make trax lsh self attention deterministic with numpy seed + copy paste code
* add same config
* add same config
* make layer init work
* implemented hash_vectors function for lsh attention
* continue reformer translation
* hf LSHSelfAttentionLayer gives same output as trax layer
* refactor code
* refactor code
* refactor code
* refactor
* refactor + add reformer config
* delete bogus file
* split reformer attention layer into two layers
* save intermediate step
* save intermediate step
* make test work
* add complete reformer block layer
* finish reformer layer
* implement causal and self mask
* clean reformer test and refactor code
* fix merge conflicts
* fix merge conflicts
* update init
* fix device for GPU
* fix chunk length init for tests
* include Morgan's optimization
* improve memory a bit
* improve comment
* factorize num_buckets
* better testing parameters
* make whole model work
* make lm model work
* add t5 copy paste tokenizer
* add chunking feed forward
* clean config
* add improved assert statements
* make tokenizer work
* improve test
* correct typo
* extend config
* add more complex test
* add new axial position embeddings
* add local block attention layer
* clean tests
* refactor
* better testing
* save intermediate progress
* clean test file
* make shorter input length work for model
* allow variable input length
* refactor
* make forward pass for pretrained model work
* add generation possibility
* finish dropout and init
* make style
* refactor
* add first version of RevNet Layers
* make forward pass work and add convert file
* make uploaded model forward pass work
* make uploaded model forward pass work
* refactor code
* add namedtuples and cache buckets
* correct head masks
* refactor
* made reformer more flexible
* make style
* remove set max length
* add attention masks
* fix up tests
* fix lsh attention mask
* make random seed optional for the moment
* improve memory in reformer
* add tests
* make style
* make sure masks work correctly
* detach gradients
* save intermediate
* correct backprop through gather
* make style
* change back num hashes
* rename to labels
* fix rotation shape
* fix detach
* update
* fix trainer
* fix backward dropout
* make reformer more flexible
* fix conflict
* fix
* fix
* add tests for fixed seed in reformer layer
* fix trainer typo
* fix typo in activations
* add fp16 tests
* add fp16 training
* support fp16
* correct gradient bug in reformer
* add fast gelu
* re-add dropout for embedding dropout
* better naming
* better naming
* renaming
* finalize test branch
* finalize tests
* add more tests
* finish tests
* fix
* fix type trainer
* fix fp16 tests
* fix tests
* fix tests
* fix tests
* fix issue with dropout
* fix dropout seeds
* correct random seed on gpu
* finalize random seed for dropout
* finalize random seed for dropout
* remove duplicate line
* correct half precision bug
* make style
* refactor
* refactor
* docstring
* remove sinusoidal position encodings for reformer
* move chunking to modeling_utils
* make style
* clean config
* make style
* fix tests
* fix auto tests
* pretrained models
* fix docstring
* update conversion file
* Update pretrained_models.rst
* fix rst
* fix rst
* update copyright
* fix test path
* fix test path
* fix small issue in test
* include reformer in generation tests
* add docs for axial position encoding
* finish docs
* Update convert_reformer_trax_checkpoint_to_pytorch.py
* remove isort
* include Sam's comments
* remove wrong comment in utils
* correct typos
* fix typo
* Update reformer.rst
* applied Morgan's optimization
* make style
* make gpu compatible
* remove bogus file
* big test refactor
* add example for chunking
* fix typo
* add to README
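A short generation sketch with the `google/reformer-crime-and-punishment` checkpoint uploaded alongside this work; LSH attention and the new axial position embeddings are what make very long contexts tractable, even if this toy prompt doesn't need them:

```python
from transformers import ReformerModelWithLMHead, ReformerTokenizer

model_name = "google/reformer-crime-and-punishment"
tokenizer = ReformerTokenizer.from_pretrained(model_name)
model = ReformerModelWithLMHead.from_pretrained(model_name)

input_ids = tokenizer.encode("A few months later", return_tensors="pt")
# Sample rather than decode greedily; this small LM loops otherwise.
generated = model.generate(input_ids, do_sample=True, max_length=100)
print(tokenizer.decode(generated[0]))
```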
-
- 06 May, 2020 1 commit
-
Julien Plu authored
* First commit to add a TF version of the trainer.
* Make the TF trainer closer to what the PT trainer looks like
* Refactoring common code between the PT and TF trainer into a util file.
* Some bugfixes + better similarity with the PT trainer
* Add missing class in transformers init
* Bugfix over prediction + use classification report instead of simple metrics
* Fix name error
* Fix optimization tests + style
* Apply style
* Several bugfixes for multi-gpu training
* Apply style
* Apply style
* Add glue example for the TF trainer
* Several bugfixes + address the reviews
* Fix on the TF training args file
* Add a debug mode
* Bugfix in utils_ner.py when segment_ids is None
* Apply style
* Apply style
* Add TPU strategy
* Fix selection strategy
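A compact sketch of the new trainer's shape (`TFTrainer` and `TFTrainingArguments` were later deprecated in favor of plain Keras training; the tiny in-memory dataset below is a stand-in):

```python
import tensorflow as tf
from transformers import (TFBertForSequenceClassification, TFTrainer,
                          TFTrainingArguments)

model = TFBertForSequenceClassification.from_pretrained("bert-base-cased")

# Stand-in data: TFTrainer expects a tf.data.Dataset of (features, labels).
features = {
    "input_ids": tf.constant([[101, 7592, 2088, 102]] * 8),
    "attention_mask": tf.constant([[1, 1, 1, 1]] * 8),
}
labels = tf.constant([1, 0] * 4)
train_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

args = TFTrainingArguments(output_dir="./tf_trainer_out", logging_steps=1)
trainer = TFTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```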
-
- 05 May, 2020 1 commit
-
Lysandre Debut authored
* Standard deviation can no longer be set to 0 * Remove torch pinned version * 9th instead of 10th, silly me
-
- 04 May, 2020 1 commit
-
Patrick von Platen authored
* Hoist BERT model tester for Patrick
* indent
* make tests work
* Update tests/test_modeling_bert.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: sshleifer <sshleifer@gmail.com>
-
- 01 May, 2020 3 commits
-
Sam Shleifer authored
-
Julien Chaumond authored
-
Julien Chaumond authored
There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models in the default cache
- and often, in both for the same models

When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth. I'd rather always use the default cache.
-
- 30 Apr, 2020 2 commits
-
Julien Chaumond authored
-
Julien Chaumond authored
-