1. 20 Oct, 2020 1 commit
  2. 19 Oct, 2020 1 commit
    • ProphetNet (#7157) · 2422cda0
      Weizhen authored
      
      
      * add new model prophetnet
      
      prophetnet modified
      
      modify codes as suggested v1
      
      add prophetnet test files
      
      * still bugs, because of changed output formats of encoder and decoder
      
      * move prophetnet into the latest version
      
      * clean integration tests
      
      * clean tokenizers
      
      * add xlm config to init
      
      * correct typo in init
      
      * further refactoring
      
      * continue refactor
      
      * save parallel
      
      * add decoder_attention_mask
      
      * fix use_cache vs. past_key_values
      
      * fix common tests
      
      * change decoder output logits
      
      * fix xlm tests
      
      * make common tests pass
      
      * change model architecture
      
      * add tokenizer tests
      
      * finalize model structure
      
      * no weight mapping
      
      * correct n-gram stream attention mask as discussed with qweizhen
      
      * remove unused import
      
      * fix index.rst
      
      * fix tests
      
      * delete unnecessary code
      
      * add fast integration test
      
      * rename weights
      
      * final weight remapping
      
      * save intermediate
      
      * Descriptions for Prophetnet Config File
      
      * finish all models
      
      * finish new model outputs
      
      * delete unnecessary files
      
      * refactor encoder layer
      
      * add dummy docs
      
      * code quality
      
      * fix tests
      
      * add model pages to doctree
      
      * further refactor
      
      * more refactor, more tests
      
      * finish code refactor and tests
      
      * remove unnecessary files
      
      * further clean up
      
      * add docstring template
      
      * finish tokenizer doc
      
      * finish prophetnet
      
      * fix copies
      
      * fix typos
      
      * fix tf tests
      
      * fix fp16
      
      * fix tf test 2nd try
      
      * fix code quality
      
      * add test for each model
      
      * merge new tests to branch
      
      * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      
      * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      
      * Update src/transformers/modeling_prophetnet.py
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      
      * Update utils/check_repo.py
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      
      * apply sams and sylvains comments
      
      * make style
      
      * remove unnecessary code
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/configuration_prophetnet.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * implement lysandres comments
      
      * correct docs
      
      * fix isort
      
      * fix tokenizers
      
      * fix copies
      Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  3. 07 Oct, 2020 1 commit
  4. 01 Oct, 2020 1 commit
  5. 08 Sep, 2020 1 commit
  6. 02 Sep, 2020 1 commit
    • [testing] fix ambiguous test (#6898) · e71f32c0
      Stas Bekman authored
      Since `generate()` does:
      ```
              num_beams = num_beams if num_beams is not None else self.config.num_beams
      ```
      This test fails if `model.config.num_beams > 1` (which is the case in the model I'm porting).
      
      This fix makes the test setup unambiguous by passing an explicit `num_beams=1` to `generate()`.
      
      Thanks.
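
The fallback quoted in this commit message can be reproduced in isolation. The following is a minimal sketch, not the library's code: `resolve_num_beams` and the `SimpleNamespace` config are hypothetical stand-ins, and only the conditional expression is taken from the message above. It shows why a test that omits `num_beams` depends on the model's config, while passing `num_beams=1` does not.

```python
from types import SimpleNamespace


def resolve_num_beams(num_beams, config):
    # Same conditional as the line quoted in the commit message:
    # an explicit argument wins, otherwise the config value is used.
    return num_beams if num_beams is not None else config.num_beams


# Hypothetical config whose default enables beam search.
config = SimpleNamespace(num_beams=4)

implicit = resolve_num_beams(None, config)  # falls back to config.num_beams
explicit = resolve_num_beams(1, config)     # the explicit value is kept
```

With the explicit argument, the result no longer changes when a ported model ships a config with `num_beams > 1`.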
  7. 26 Aug, 2020 1 commit
  8. 24 Aug, 2020 1 commit
  9. 20 Aug, 2020 1 commit
  10. 19 Aug, 2020 2 commits
  11. 13 Aug, 2020 1 commit
  12. 11 Aug, 2020 1 commit
    • Feed forward chunking (#6024) · b25cec13
      Pradhy729 authored
      
      
      * Chunked feed forward for Bert
      
      This is an initial implementation to test applying feed forward chunking for BERT.
      Will need additional modifications based on output and benchmark results.
      
      * Black and cleanup
      
      * Feed forward chunking in BertLayer class.
      
      * Isort
      
      * add chunking for all models
      
      * fix docs
      
      * Fix typo
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
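
The idea behind feed forward chunking can be illustrated without any framework. This is a rough sketch under stated assumptions, not the implementation from the commit: `feed_forward` is a hypothetical position-wise layer, nested lists stand in for tensors, and treating `chunk_size <= 0` as "no chunking" is a convention chosen for this example. Because a position-wise layer handles each position independently, applying it to slices of the sequence and concatenating gives the same result as one full pass, while only one slice has to be processed at a time.

```python
def feed_forward(chunk):
    """Hypothetical position-wise layer: each position is transformed independently."""
    return [[2.0 * x + 1.0 for x in position] for position in chunk]


def apply_chunked(fn, sequence, chunk_size):
    """Apply fn to slices along the sequence dimension and concatenate the results."""
    if chunk_size <= 0:
        # Convention for this sketch only: no chunking requested.
        return fn(sequence)
    output = []
    for start in range(0, len(sequence), chunk_size):
        output.extend(fn(sequence[start:start + chunk_size]))
    return output


# Five positions with two features each.
sequence = [[float(i), float(i + 1)] for i in range(5)]

full = feed_forward(sequence)
chunked = apply_chunked(feed_forward, sequence, chunk_size=2)
```

Here `chunked` and `full` are equal; the chunked path trades extra calls for a smaller working set per call.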
  13. 31 Jul, 2020 1 commit
  14. 30 Jul, 2020 1 commit
    • Switch from return_tuple to return_dict (#6138) · 91cb9546
      Sylvain Gugger authored
      
      
      * Switch from return_tuple to return_dict
      
      * Fix test
      
      * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
      
      * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
      
      * AutoModels
      
      
      Tiny tweaks
      
      * Style
      
      * Final changes before merge
      
      * Re-order for simpler review
      
      * Final fixes
      
      * Addressing @sgugger's comments
      
      * Test MultipleChoice
      
      * Rework TF trainer (#6038)
      
      * Fully rework training/prediction loops
      
      * fix method name
      
      * Fix variable name
      
      * Fix property name
      
      * Fix scope
      
      * Fix method name
      
      * Fix tuple index
      
      * Fix tuple index
      
      * Fix indentation
      
      * Fix variable name
      
      * fix eval before log
      
      * Add drop remainder for test dataset
      
      * Fix step number + fix logging datetime
      
      * fix eval loss value
      
      * use global step instead of step + fix logging at step 0
      
      * Fix logging datetime
      
      * Fix global_step usage
      
      * Fix breaking loop + logging datetime
      
      * Fix step in prediction loop
      
      * Fix step breaking
      
      * Fix train/test loops
      
      * Force TF at least 2.2 for the trainer
      
      * Use assert_cardinality to facilitate the dataset size computation
      
      * Log steps per epoch
      
      * Make tfds compliant with TPU
      
      * Make tfds compliant with TPU
      
      * Use TF dataset enumerate instead of the Python one
      
      * revert previous commit
      
      * Fix data_dir
      
      * Apply style
      
      * rebase on master
      
      * Address Sylvain's comments
      
      * Address Sylvain's and Lysandre comments
      
      * Trigger CI
      
      * Remove unused import
      
      * Switch from return_tuple to return_dict
      
      * Fix test
      
      * Add recent model
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Julien Plu <plu.julien@gmail.com>
  15. 29 Jul, 2020 1 commit
  16. 20 Jul, 2020 1 commit
    • DataParallel fixes (#5733) · 35cb101e
      Stas Bekman authored
      * DataParallel fixes:
      
      1. switched to a more precise check
      -        if self.args.n_gpu > 1:
      +        if isinstance(model, nn.DataParallel):
      
      2. fix tests - require the same fixup under DataParallel as the training module
      
      * another fix
  17. 10 Jul, 2020 1 commit
    • Change model outputs types to self-document outputs (#5438) · edfd82f5
      Sylvain Gugger authored
      * [WIP] Proposal for model outputs
      
      * All Bert models
      
      * Make CI green maybe?
      
      * Fix ONNX test
      
      * Isolate ModelOutput from pt and tf
      
      * Formatting
      
      * Add Electra models
      
      * Auto-generate docstrings from outputs
      
      * Add TF outputs
      
      * Add some BERT models
      
      * Revert TF side
      
      * Remove last traces of TF changes
      
      * Fail with a clear error message
      
      * Add Albert and work through Bart
      
      * Add CTRL and DistilBert
      
      * Formatting
      
      * Progress on Bart
      
      * Renames and finish Bart
      
      * Formatting
      
      * Fix last test
      
      * Add DPR
      
      * Finish Electra and add FlauBERT
      
      * Add GPT2
      
      * Add Longformer
      
      * Add MMBT
      
      * Add MobileBert
      
      * Add GPT
      
      * Formatting
      
      * Add Reformer
      
      * Add Roberta
      
      * Add T5
      
      * Add Transformer XL
      
      * Fix test
      
      * Add XLM + fix XLMForTokenClassification
      
      * Style + XLMRoberta
      
      * Add XLNet
      
      * Formatting
      
      * Add doc of return_tuple arg
  18. 07 Jul, 2020 1 commit
  19. 01 Jul, 2020 2 commits
  20. 25 Jun, 2020 1 commit
    • [Tokenization] Fix #5181 - make #5155 more explicit - move back the default logging level in tests to WARNING (#5252) · 27cf1d97
      Thomas Wolf authored
      
      * fix-5181
      
      Padding to max sequence length while truncation to another length was wrong on slow tokenizers
      
      * clean up and fix #5155
      
      * fix XLM test
      
      * Fix tests for Transfo-XL
      
      * logging only above WARNING in tests
      
      * switch slow tokenizers tests in @slow
      
      * fix Marian truncation tokenization test
      
      * style and quality
      
      * make the test a lot faster by limiting the sequence length used in tests
  21. 22 Jun, 2020 1 commit
    • Output hidden states (#4978) · f4e1f022
      Joseph Liu authored
      
      
      * Configure all models to use output_hidden_states as argument passed to forward()
      
      * Pass all tests
      
      * Remove cast_bool_to_primitive in TF Flaubert model
      
      * correct tf xlnet
      
      * add pytorch test
      
      * add tf test
      
      * Fix broken tests
      
      * Configure all models to use output_hidden_states as argument passed to forward()
      
      * Pass all tests
      
      * Remove cast_bool_to_primitive in TF Flaubert model
      
      * correct tf xlnet
      
      * add pytorch test
      
      * add tf test
      
      * Fix broken tests
      
      * Refactor output_hidden_states for mobilebert
      
      * Reset and remerge to master
      Co-authored-by: Joseph Liu <joseph.liu@coinflex.com>
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
  22. 15 Jun, 2020 1 commit
  23. 10 Jun, 2020 3 commits
  24. 09 Jun, 2020 1 commit
    • [All models] Extend config.output_attentions with output_attentions function arguments (#4538) · 6e603cb7
      Bharat Raghunathan authored
      
      
      * DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * Fix further regressions in tests relating to `output_attentions`
      
      Ensure proper propagation of `output_attentions` as a function parameter
      to all model subclasses
      
      * Fix more regressions in `test_output_attentions`
      
      * Fix issues with BertEncoder
      
      * Rename related variables to `output_attentions`
      
      * fix pytorch tests
      
      * fix bert and gpt2 tf
      
      * Fix most TF tests for `test_output_attentions`
      
      * Fix linter errors and more TF tests
      
      * fix conflicts
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix pytorch tests
      
      * fix conflicts
      
      * fix conflicts
      
      * Fix linter errors and more TF tests
      
      * fix tf tests
      
      * make style
      
      * fix isort
      
      * improve output_attentions
      
      * improve tensorflow
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  25. 02 Jun, 2020 1 commit
    • Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
  26. 19 May, 2020 1 commit
    • Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300) · 4c068936
      Julien Chaumond authored
      * Test case for #3936
      
      * multigpu tests pass on pytorch 1.4.0
      
      * Fixup
      
      * multigpu tests pass on pytorch 1.5.0
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * rename multigpu to require_multigpu
      
      * mode doc
  27. 07 May, 2020 1 commit
    • Reformer (#3351) · dca34695
      Patrick von Platen authored
      * first copy & paste commit from Bert and morgans LSH code
      
      * add easy way to compare to trax original code
      
      * translate most of function
      
      * make trax lsh self attention deterministic with numpy seed + copy paste code
      
      * add same config
      
      * add same config
      
      * make layer init work
      
      * implemented hash_vectors function for lsh attention
      
      * continue reformer translation
      
      * hf LSHSelfAttentionLayer gives same output as trax layer
      
      * refactor code
      
      * refactor code
      
      * refactor code
      
      * refactor
      
      * refactor + add reformer config
      
      * delete bogus file
      
      * split reformer attention layer into two layers
      
      * save intermediate step
      
      * save intermediate step
      
      * make test work
      
      * add complete reformer block layer
      
      * finish reformer layer
      
      * implement causal and self mask
      
      * clean reformer test and refactor code
      
      * fix merge conflicts
      
      * fix merge conflicts
      
      * update init
      
      * fix device for GPU
      
      * fix chunk length init for tests
      
      * include morgans optimization
      
      * improve memory a bit
      
      * improve comment
      
      * factorize num_buckets
      
      * better testing parameters
      
      * make whole model work
      
      * make lm model work
      
      * add t5 copy paste tokenizer
      
      * add chunking feed forward
      
      * clean config
      
      * add improved assert statements
      
      * make tokenizer work
      
      * improve test
      
      * correct typo
      
      * extend config
      
      * add more complex test
      
      * add new axial position embeddings
      
      * add local block attention layer
      
      * clean tests
      
      * refactor
      
      * better testing
      
      * save intermediate progress
      
      * clean test file
      
      * make shorter input length work for model
      
      * allow variable input length
      
      * refactor
      
      * make forward pass for pretrained model work
      
      * add generation possibility
      
      * finish dropout and init
      
      * make style
      
      * refactor
      
      * add first version of RevNet Layers
      
      * make forward pass work and add convert file
      
      * make uploaded model forward pass work
      
      * make uploaded model forward pass work
      
      * refactor code
      
      * add namedtuples and cache buckets
      
      * correct head masks
      
      * refactor
      
      * made reformer more flexible
      
      * make style
      
      * remove set max length
      
      * add attention masks
      
      * fix up tests
      
      * fix lsh attention mask
      
      * make random seed optional for the moment
      
      * improve memory in reformer
      
      * add tests
      
      * make style
      
      * make sure masks work correctly
      
      * detach gradients
      
      * save intermediate
      
      * correct backprop through gather
      
      * make style
      
      * change back num hashes
      
      * rename to labels
      
      * fix rotation shape
      
      * fix detach
      
      * update
      
      * fix trainer
      
      * fix backward dropout
      
      * make reformer more flexible
      
      * fix conflict
      
      * fix
      
      * fix
      
      * add tests for fixed seed in reformer layer
      
      * fix trainer typo
      
      * fix typo in activations
      
      * add fp16 tests
      
      * add fp16 training
      
      * support fp16
      
      * correct gradient bug in reformer
      
      * add fast gelu
      
      * re-add dropout for embedding dropout
      
      * better naming
      
      * better naming
      
      * renaming
      
      * finalize test branch
      
      * finalize tests
      
      * add more tests
      
      * finish tests
      
      * fix
      
      * fix type trainer
      
      * fix fp16 tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix issue with dropout
      
      * fix dropout seeds
      
      * correct random seed on gpu
      
      * finalize random seed for dropout
      
      * finalize random seed for dropout
      
      * remove duplicate line
      
      * correct half precision bug
      
      * make style
      
      * refactor
      
      * refactor
      
      * docstring
      
      * remove sinusoidal position encodings for reformer
      
      * move chunking to modeling_utils
      
      * make style
      
      * clean config
      
      * make style
      
      * fix tests
      
      * fix auto tests
      
      * pretrained models
      
      * fix docstring
      
      * update conversion file
      
      * Update pretrained_models.rst
      
      * fix rst
      
      * fix rst
      
      * update copyright
      
      * fix test path
      
      * fix test path
      
      * fix small issue in test
      
      * include reformer in generation tests
      
      * add docs for axial position encoding
      
      * finish docs
      
      * Update convert_reformer_trax_checkpoint_to_pytorch.py
      
      * remove isort
      
      * include sams comments
      
      * remove wrong comment in utils
      
      * correct typos
      
      * fix typo
      
      * Update reformer.rst
      
      * applied morgans optimization
      
      * make style
      
      * make gpu compatible
      
      * remove bogus file
      
      * big test refactor
      
      * add example for chunking
      
      * fix typo
      
      * add to README
  28. 05 May, 2020 1 commit
    • Pytorch 1.5.0 (#3973) · 79b1c696
      Lysandre Debut authored
      * Standard deviation can no longer be set to 0
      
      * Remove torch pinned version
      
      * 9th instead of 10th, silly me
  29. 29 Apr, 2020 1 commit
  30. 14 Apr, 2020 1 commit
  31. 09 Apr, 2020 1 commit
  32. 06 Apr, 2020 1 commit
  33. 31 Mar, 2020 1 commit
  34. 26 Mar, 2020 1 commit
  35. 19 Mar, 2020 1 commit
    • Support T5 Generation (#3228) · bbf26c4e
      Patrick von Platen authored
      
      
      * fix conflicts
      
      * update bart max length test
      
      * correct spelling mistakes
      
      * implemented model specific encode function
      
      * fix merge conflicts
      
      * better naming
      
      * save intermediate state -> need to rethink structure a bit
      
      * leave tf problem as it is for now
      
      * current version
      
      * add layers.pop
      
      * remove ipdb
      
      * make style
      
      * clean return cut decoding
      
      * remove ipdbs
      
      * Fix restoring layers in the decoders that don't exist.
      
      * push good intermediate solution for now
      
      * fix conflicts
      
      * always good to refuse to merge conflicts when rebasing
      
      * fix small bug
      
      * improve function calls
      
      * remove unused file
      
      * add correct scope behavior for t5_generate
      Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
  36. 17 Mar, 2020 1 commit