"examples/run_mmimdb.py" did not exist on "912fdff899cf0fd674ed357e46a0209311aefad2"
  1. 15 Jul, 2020 1 commit
  2. 14 Jul, 2020 2 commits
    • [Reformer classification head] Implement the reformer model classification head for text classification (#5198) · f867000f
      as-stevens authored
      
      * Reformer model head classification implementation for text classification
      
      * Reformat the reformer model classification code
      
      * PR review comments, and test case implementation for reformer for classification head changes
      
      * CI/CD reformer for classification head test import error fix
      
      * CI/CD test case implementation  added ReformerForSequenceClassification to all_model_classes
      
      * Code formatting- fixed
      
      * Normal test cases added for reformer classification head
      
      * Fix test cases implementation for the reformer classification head
      
      * removed token_type_id parameter from the reformer classification head
      
      * fixed the test case for reformer classification head
      
      * merge conflict with master fixed
      
      * merge conflict, changed reformer classification to accept the choice_label parameter added in latest code
      
      * refactored the reformer classification head test code
      
      * reformer classification head, common transform test cases fixed
      
      * final set of the review comment, rearranging the reformer classes and docstring add to classification forward method
      
      * fixed the compilation error and test case fix for reformer classification head
      
      * Apply suggestions from code review
      
      Remove unnecessary dup
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      f867000f
  3. 13 Jul, 2020 2 commits
  4. 10 Jul, 2020 1 commit
    • Change model outputs types to self-document outputs (#5438) · edfd82f5
      Sylvain Gugger authored
      * [WIP] Proposal for model outputs
      
      * All Bert models
      
      * Make CI green maybe?
      
      * Fix ONNX test
      
      * Isolate ModelOutput from pt and tf
      
      * Formatting
      
      * Add Electra models
      
      * Auto-generate docstrings from outputs
      
      * Add TF outputs
      
      * Add some BERT models
      
      * Revert TF side
      
      * Remove last traces of TF changes
      
      * Fail with a clear error message
      
      * Add Albert and work through Bart
      
      * Add CTRL and DistilBert
      
      * Formatting
      
      * Progress on Bart
      
      * Renames and finish Bart
      
      * Formatting
      
      * Fix last test
      
      * Add DPR
      
      * Finish Electra and add FlauBERT
      
      * Add GPT2
      
      * Add Longformer
      
      * Add MMBT
      
      * Add MobileBert
      
      * Add GPT
      
      * Formatting
      
      * Add Reformer
      
      * Add Roberta
      
      * Add T5
      
      * Add Transformer XL
      
      * Fix test
      
      * Add XLM + fix XLMForTokenClassification
      
      * Style + XLMRoberta
      
      * Add XLNet
      
      * Formatting
      
      * Add doc of return_tuple arg
      edfd82f5
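
      The idea of self-documenting outputs with a `return_tuple` switch can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchClassifierOutput`, `to_tuple` and `forward_sketch` are hypothetical names, not the classes or methods added in #5438.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional, Tuple


@dataclass
class SketchClassifierOutput:
    """Hypothetical output container: each field is named, so callers can
    write `out.logits` instead of remembering a tuple position."""

    loss: Optional[float] = None
    logits: Optional[List[float]] = None
    hidden_states: Optional[tuple] = None

    def to_tuple(self) -> Tuple[Any, ...]:
        # Keep only the fields that were actually set, in declaration order.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


def forward_sketch(return_tuple: bool = False):
    # Stand-in for a model forward pass; the numbers are made up.
    out = SketchClassifierOutput(logits=[0.1, 0.9])
    return out.to_tuple() if return_tuple else out


named = forward_sketch()
plain = forward_sketch(return_tuple=True)
print(named.logits, plain)
```

      Named access documents what each value is, while the tuple form keeps older positional code working.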
  5. 08 Jul, 2020 2 commits
    • Fix Inconsistent NER Grouping (Pipeline) (#4987) · 0cc4eae0
      Lorenzo Ampil authored
      
      
      * Add B I handling to grouping
      
      * Add fix to include separate entity as last token
      
      * move last_idx definition outside loop
      
      * Use first entity in entity group as reference for entity type
      
      * Add test cases
      
      * Take out extra class accidentally added
      
      * Return tf ner grouped test to original
      
      * Take out redundant last entity
      
      * Get last_idx safely
      Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>
      
      * Fix first entity comment
      
      * Create separate functions for group_sub_entities and group_entities (splitting call method to testable functions)
      
      * Take out unnecessary last_idx
      
      * Remove additional forward pass test
      
      * Move token classification basic tests to separate class
      
      * Move token classification basic tests back to monocolumninputtestcase
      
      * Move base ner tests to nerpipelinetests
      
      * Take out unused kwargs
      
      * Add back mandatory_keys argument
      
      * Add unitary tests for group_entities in _test_ner_pipeline
      
      * Fix last entity handling
      
      * Fix grouping function used
      
      * Add typing to group_sub_entities and group_entities
      Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>
      0cc4eae0
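
      The B/I handling and the "first entity as reference" rule listed above might look roughly like the sketch below. It is a simplified illustration, not the pipeline's code: `group_entities_sketch` is a hypothetical helper, and it assumes every token already carries a `B-` or `I-` tag (no `O` tokens, no subword merging).

```python
from typing import Dict, List


def entity_type(tag: str) -> str:
    # "B-PER" -> "PER"
    return tag.partition("-")[2]


def group_entities_sketch(tokens: List[Dict[str, str]]) -> List[Dict[str, str]]:
    groups: List[List[Dict[str, str]]] = []
    for tok in tokens:
        prefix = tok["entity"].partition("-")[0]
        # Start a new group on a B- tag, or when the type differs from the
        # first token of the current group (the reference for the group type).
        starts_new = (
            not groups
            or prefix == "B"
            or entity_type(groups[-1][0]["entity"]) != entity_type(tok["entity"])
        )
        if starts_new:
            groups.append([tok])
        else:
            groups[-1].append(tok)
    return [
        {
            "entity_group": entity_type(group[0]["entity"]),
            "word": " ".join(t["word"] for t in group),
        }
        for group in groups
    ]


example = [
    {"word": "Sam", "entity": "B-PER"},
    {"word": "Smith", "entity": "I-PER"},
    {"word": "Paris", "entity": "B-LOC"},
    {"word": "London", "entity": "B-LOC"},
]
print(group_entities_sketch(example))
```

      Because the loop appends the final group as it goes, the last entity needs no special case after the loop.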
    • [Benchmark] Add benchmarks for TF Training (#5594) · f82a2a5e
      Patrick von Platen authored
      * tf_train
      
      * adapt timing for tpu
      
      * fix timing
      
      * fix timing
      
      * fix timing
      
      * fix timing
      
      * update notebook
      
      * add tests
      f82a2a5e
  6. 07 Jul, 2020 8 commits
    • Add mbart-large-cc25, support translation finetuning (#5129) · 353b8f1e
      Sam Shleifer authored
      improve unittests for finetuning, especially w.r.t testing frozen parameters
      fix freeze_embeds for T5
      add streamlit setup.cfg
      353b8f1e
    • [Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395) · 4dc65591
      Patrick von Platen authored
      
      * add first version of clm tf
      
      * make style
      
      * add more tests for bert
      
      * update tf clm loss
      
      * fix tests
      
      * correct tf ner script
      
      * add mlm loss
      
      * delete bogus file
      
      * clean tf auto model + add tests
      
      * finish adding clm loss everywhere
      
      * fix training in distilbert
      
      * fix flake8
      
      * save intermediate
      
      * fix tf t5 naming
      
      * remove prints
      
      * finish up
      
      * up
      
      * fix tf gpt2
      
      * fix new test utils import
      
      * fix flake8
      
      * keep backward compatibility
      
      * Update src/transformers/modeling_tf_albert.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_auto.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_electra.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_roberta.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_mobilebert.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_auto.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_bert.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * Update src/transformers/modeling_tf_distilbert.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * apply sylvains suggestions
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      4dc65591
    • Fix tests imports dpr (#5576) · 4fedc125
      Quentin Lhoest authored
      * fix test imports
      
      * fix max_length
      
      * style
      
      * fix tests
      4fedc125
    • [Bart] enable test_torchscript, update test_tie_weights (#5457) · d4886173
      Sam Shleifer authored
      * Passing all but one torchscript test
      
      * Style
      
      * move comment
      
      * remove unneeded assert
      d4886173
    • Add DPR model (#5279) · fbd87921
      Quentin Lhoest authored
      
      
      * beginning of dpr modeling
      
      * wip
      
      * implement forward
      
      * remove biencoder + better init weights
      
      * export dpr model to embed model for nlp lib
      
      * add new api
      
      * remove old code
      
      * make style
      
      * fix dumb typo
      
      * don't load bert weights
      
      * docs
      
      * docs
      
      * style
      
      * move the `k` parameter
      
      * fix init_weights
      
      * add pretrained configs
      
      * minor
      
      * update config names
      
      * style
      
      * better config
      
      * style
      
      * clean code based on PR comments
      
      * change Dpr to DPR
      
      * fix config
      
      * switch encoder config to a dict
      
      * style
      
      * inheritance -> composition
      
      * add messages in assert statements
      
      * add dpr reader tokenizer
      
      * one tokenizer per model
      
      * fix base_model_prefix
      
      * fix imports
      
      * typo
      
      * add convert script
      
      * docs
      
      * change tokenizers conf names
      
      * style
      
      * change tokenizers conf names
      
      * minor
      
      * minor
      
      * fix wrong names
      
      * minor
      
      * remove unused convert functions
      
      * rename convert script
      
      * use return_tensors in tokenizers
      
      * remove n_questions dim
      
      * move generate logic to tokenizer
      
      * style
      
      * add docs
      
      * docs
      
      * quality
      
      * docs
      
      * add tests
      
      * style
      
      * add tokenization tests
      
      * DPR full tests
      
      * Stay true to the attention mask building
      
      * update docs
      
      * missing param in bert input docs
      
      * docs
      
      * style
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      fbd87921
    • Make T5 compatible with ONNX (#5518) · 69122657
      Abel authored
      
      
      * Default decoder inputs to encoder ones for T5 if neither are specified.
      
      * Fixing typo, now all tests are passing.
      
      * Changing einsum to operations supported by onnx
      
      * Adding a test to ensure T5 can be exported to onnx op>9
      
      * Modified test for onnx export to make it faster
      
      * Styling changes.
      
      * Styling changes.
      
      * Changing notation for matrix multiplication
      Co-authored-by: Abel Riboulot <tkai@protomail.com>
      69122657
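
      Replacing an einsum with more widely supported operations relies on the fact that an index-sum contraction can be written as a transpose followed by a matrix multiplication. The pure-Python sketch below checks that equivalence on small 2-D lists. The subscript pattern `qd,kd->qk` and all function names are assumptions chosen for illustration, not taken from the T5 code.

```python
from typing import List

Matrix = List[List[float]]


def scores_index_sum(q: Matrix, k: Matrix) -> Matrix:
    # Einsum-style definition "qd,kd->qk": sum over the shared index d.
    return [
        [sum(q_row[d] * k_row[d] for d in range(len(q_row))) for k_row in k]
        for q_row in q
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    # Plain matrix product: rows of a against columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


q = [[1.0, 2.0], [3.0, 4.0]]
k = [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

# Both formulations produce the same score matrix.
assert scores_index_sum(q, k) == matmul(q, transpose(k))
print(matmul(q, transpose(k)))
```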
    • [Reformer] Adapt Reformer MaskedLM Attn mask (#5560) · 989ae326
      Patrick von Platen authored
      * fix attention mask
      
      * fix slow test
      
      * refactor attn masks
      
      * fix fp16 generate test
      989ae326
    • Added data collator for permutation (XLNet) language modeling and related calls (#5522) · 3dcb748e
      Shashank Gupta authored
      * Added data collator for XLNet language modeling and related calls
      
      Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
      to generate necessary inputs for language modeling training with
      XLNetLMHeadModel. Also added related arguments, logic and calls in
      examples/language-modeling/run_language_modeling.py.
      
      Resolves: #4739, #2008 (partially)
      
      * Changed name to `DataCollatorForPermutationLanguageModeling`
      
      Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
      Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
      CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of
      similar to `mems` for XLNet).
      Changed calls and imports appropriately.
      
      * Added detailed comments, changed variable names
      
      Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names and made them more informative.
      
      * Added tests for new data collator
      
      Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
      
      * Fixed styling issues
      3dcb748e
  7. 06 Jul, 2020 1 commit
    • Various tokenizers fixes (#5558) · 5787e4c1
      Anthony MOI authored
      * BertTokenizerFast - Do not specify strip_accents by default
      
      * Bump tokenizers to new version
      
      * Add test for AddedToken serialization
      5787e4c1
  8. 03 Jul, 2020 2 commits
  9. 02 Jul, 2020 1 commit
  10. 01 Jul, 2020 8 commits
  11. 30 Jun, 2020 1 commit
  12. 29 Jun, 2020 1 commit
  13. 28 Jun, 2020 1 commit
  14. 26 Jun, 2020 4 commits
  15. 25 Jun, 2020 3 commits
  16. 24 Jun, 2020 2 commits