1. 17 Jul, 2020 1 commit
    • XLNet `use_cache` refactor (#5770) · 0b2da0e5
      Teven authored
      Slightly breaking change to `use_cache` in XLNet: if `use_cache` is True and `mem_len` is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns `mems` to be used as `past` in generation. At training time, `use_cache` is overridden and always True.
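The cache rules described in this commit can be sketched as a small decision helper. This is a toy model of the described behavior, not the actual transformers code; the function name and return values are illustrative.

```python
def new_mems_behavior(use_cache, mem_len, training):
    """Toy sketch of the post-refactor `use_cache` rules:
    - at training time, use_cache is overridden and always True;
    - with caching off, no mems are returned;
    - with mem_len of 0 or None (the base config default), the model
      behaves like GPT-2 and returns mems usable as `past` in generation;
    - otherwise XLNet keeps its usual memory of the last mem_len tokens."""
    if training:
        use_cache = True
    if not use_cache:
        return None
    if mem_len is None or mem_len == 0:
        return "gpt2-style past"
    return "xlnet-style memory"
```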
  2. 10 Jul, 2020 1 commit
    • Change model outputs types to self-document outputs (#5438) · edfd82f5
      Sylvain Gugger authored
      * [WIP] Proposal for model outputs
      
      * All Bert models
      
      * Make CI green maybe?
      
      * Fix ONNX test
      
      * Isolate ModelOutput from pt and tf
      
      * Formatting
      
      * Add Electra models
      
      * Auto-generate docstrings from outputs
      
      * Add TF outputs
      
      * Add some BERT models
      
      * Revert TF side
      
      * Remove last traces of TF changes
      
      * Fail with a clear error message
      
      * Add Albert and work through Bart
      
      * Add CTRL and DistilBert
      
      * Formatting
      
      * Progress on Bart
      
      * Renames and finish Bart
      
      * Formatting
      
      * Fix last test
      
      * Add DPR
      
      * Finish Electra and add FlauBERT
      
      * Add GPT2
      
      * Add Longformer
      
      * Add MMBT
      
      * Add MobileBert
      
      * Add GPT
      
      * Formatting
      
      * Add Reformer
      
      * Add Roberta
      
      * Add T5
      
      * Add Transformer XL
      
      * Fix test
      
      * Add XLM + fix XLMForTokenClassification
      
      * Style + XLMRoberta
      
      * Add XLNet
      
      * Formatting
      
      * Add doc of return_tuple arg
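The idea behind self-documenting outputs can be sketched with a small dataclass. This is illustrative only, not the actual `ModelOutput` in transformers: named attribute access, plus tuple-style indexing over the populated fields so code written against the old tuple returns keeps working.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToyModelOutput:
    """Minimal sketch of a self-documenting model output (illustrative)."""
    loss: Optional[float] = None
    logits: Optional[list] = None
    hidden_states: Optional[list] = None

    def to_tuple(self):
        # Only the fields that were actually populated, in declaration order.
        return tuple(
            getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        )

    def __getitem__(self, index):
        return self.to_tuple()[index]


out = ToyModelOutput(logits=[0.1, 0.9])
out.logits  # named access
out[0]      # tuple-style access skips the unset loss field
```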
  3. 01 Jul, 2020 1 commit
  4. 16 Jun, 2020 1 commit
  5. 10 Jun, 2020 1 commit
  6. 09 Jun, 2020 1 commit
    • [All models] Extend config.output_attentions with output_attentions function arguments (#4538) · 6e603cb7
      Bharat Raghunathan authored
      
      
      * DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * Fix further regressions in tests relating to `output_attentions`
      
      Ensure proper propagation of `output_attentions` as a function parameter
      to all model subclasses
      
      * Fix more regressions in `test_output_attentions`
      
      * Fix issues with BertEncoder
      
      * Rename related variables to `output_attentions`
      
      * fix pytorch tests
      
      * fix bert and gpt2 tf
      
      * Fix most TF tests for `test_output_attentions`
      
      * Fix linter errors and more TF tests
      
      * fix conflicts
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix pytorch tests
      
      * fix conflicts
      
      * fix conflicts
      
      * Fix linter errors and more TF tests
      
      * fix tf tests
      
      * make style
      
      * fix isort
      
      * improve output_attentions
      
      * improve tensorflow
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
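The precedence this PR establishes, with a per-call argument winning over the config default, can be sketched as follows. The helper name is illustrative, not the actual transformers code.

```python
def resolve_output_attentions(output_attentions, config_output_attentions):
    """Toy sketch of the precedence rule: an explicit per-call
    `output_attentions` argument overrides the model config's
    `config.output_attentions` default; None means 'use the default'."""
    if output_attentions is not None:
        return output_attentions
    return config_output_attentions
```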
  7. 02 Jun, 2020 1 commit
    • Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
  8. 27 May, 2020 1 commit
  9. 19 May, 2020 2 commits
  10. 01 May, 2020 1 commit
    • [ci] Load pretrained models into the default (long-lived) cache · f54dc3f4
      Julien Chaumond authored
      There's an inconsistency right now where:
      - we load some models into CACHE_DIR
      - we load some models into the default cache
      - and often the same models end up in both
      
      When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.
      
      I'd rather always use the default cache
  11. 08 Mar, 2020 2 commits
  12. 25 Feb, 2020 2 commits
  13. 24 Feb, 2020 1 commit
    • Add slow generate tests for pretrained lm models (#2909) · 17c45c39
      Patrick von Platen authored
      * add slow generate lm_model tests
      
      * fix conflicts
      
      * merge conflicts
      
      * fix conflicts
      
      * add slow generate lm_model tests
      
      * make style
      
      * delete unused variable
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * delete unused variable
      
      * fix conflicts
      
      * finished hard coded tests
  14. 21 Feb, 2020 1 commit
    • Improve special_token_id logic in run_generation.py and add tests (#2885) · fc38d4c8
      Patrick von Platen authored
      
      
      * improving generation
      
      * finalized special token behaviour for no_beam_search generation
      
      * solved modeling_utils merge conflict
      
      * solve merge conflicts in modeling_utils.py
      
      * add run_generation improvements from PR #2749
      
      * adapted language generation to not use hardcoded -1 if no padding token is available
      
      * remove the -1 removal, as hard-coded -1s are not necessary anymore
      
      * add lightweight language generation testing for randomly initialized models - just checking whether no errors are thrown
      
      * add slow language generation tests for pretrained models using hardcoded output with pytorch seed
      
      * delete ipdb
      
      * check that all generated tokens are valid
      
      * renaming
      
      * renaming Generation -> Generate
      
      * make style
      
      * updated so that generate_beam_search has the same token behavior as generate_no_beam_search
      
      * consistent return format for run_generation.py
      
      * deleted pretrain lm generate tests -> will be added in another PR
      
      * cleaning of unused if statements and renaming
      
      * run_generate will always return an iterable
      
      * make style
      
      * consistent renaming
      
      * improve naming, make sure generate function always returns the same tensor, add docstring
      
      * add slow tests for all lmhead models
      
      * make style and improve example comments modeling_utils
      
      * better naming and refactoring in modeling_utils
      
      * changed the fast random lm generation testing design to a more general one
      
      * delete the old testing design in gpt2
      
      * correct old variable name
      
      * temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed
      
      * adapted all fast random generate tests to new design
      
      * better warning description in modeling_utils
      
      * better comment
      
      * better comment and error message
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
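One plausible sketch of avoiding a hard-coded -1 when no padding token is available: fall back to another known special token with a warning. The fallback choice and helper name here are illustrative, not the actual transformers code.

```python
import warnings


def resolve_pad_token_id(pad_token_id, eos_token_id):
    """Toy sketch: instead of using a hard-coded -1 when no padding token
    is defined, fall back to the EOS token and warn about it."""
    if pad_token_id is None and eos_token_id is not None:
        warnings.warn(
            f"Setting pad_token_id to {eos_token_id} (eos_token_id) "
            "to be able to generate sequences in batch."
        )
        pad_token_id = eos_token_id
    return pad_token_id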
  15. 06 Jan, 2020 2 commits
  16. 22 Dec, 2019 8 commits
  17. 21 Dec, 2019 2 commits
    • Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There are a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.
      
      This is also Thomas' preference, because it allows for explicit variable
      names, to make the code easier to understand.
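The same line-length choice can be pinned in black's configuration so every contributor gets it automatically. This is a hypothetical `pyproject.toml` fragment; the commit itself only shows the command line.

```toml
[tool.black]
line-length = 119
```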
    • Take advantage of the cache when running tests. · b670c266
      Aymeric Augustin authored
      Caching models across test cases and across runs of the test suite makes
      slow tests somewhat more bearable.
      
      Use gettempdir() instead of /tmp in tests. This makes it easier to
      change the location of the cache with semi-standard TMPDIR/TEMP/TMP
      environment variables.
      
      Fix #2222.
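The gettempdir() behavior relied on here can be demonstrated with the stdlib alone; the directory below is created just for the demonstration.

```python
import os
import tempfile

# tempfile.gettempdir() honors TMPDIR (then TEMP/TMP) rather than assuming
# /tmp, so the cache location used by the tests can be redirected through
# the environment without any code changes.
custom = tempfile.mkdtemp()   # any existing, writable directory
os.environ["TMPDIR"] = custom
tempfile.tempdir = None       # drop tempfile's cached answer
assert tempfile.gettempdir() == custom
```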
  18. 13 Dec, 2019 1 commit
  19. 06 Dec, 2019 1 commit
    • Remove dependency on pytest for running tests (#2055) · 35401fe5
      Aymeric Augustin authored
      * Switch to plain unittest for skipping slow tests.
      
      Add a RUN_SLOW environment variable for running them.
      
      * Switch to plain unittest for PyTorch dependency.
      
      * Switch to plain unittest for TensorFlow dependency.
      
      * Avoid leaking open files in the test suite.
      
      This prevents spurious warnings when running tests.
      
      * Fix unicode warning on Python 2 when running tests.
      
      The warning was:
      
          UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      
      * Support running PyTorch tests on a GPU.
      
      Reverts 27e015bd.
      
      * Tests no longer require pytest.
      
      * Make tests pass on cuda
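A RUN_SLOW gate can be reproduced in plain unittest along these lines. This is a sketch; the real decorator lives in the transformers test utilities.

```python
import os
import unittest

# RUN_SLOW is unset here, so the decorated test below is skipped.
os.environ.pop("RUN_SLOW", None)


def slow(test_case):
    # Skip unless RUN_SLOW is set to a truthy value in the environment.
    return unittest.skipUnless(
        os.environ.get("RUN_SLOW", "").lower() in ("1", "true", "yes"),
        "test is slow; set RUN_SLOW=1 to run it",
    )(test_case)


class ExampleTest(unittest.TestCase):
    @slow
    def test_big_pretrained_model(self):
        self.assertTrue(True)
```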
  20. 05 Dec, 2019 1 commit
  21. 30 Nov, 2019 1 commit
    • [XLNet] Changed post-processing of attention w.r.t to target_mapping · 76c0bc06
      Rostislav Nedelchev authored
      Whenever target_mapping is provided as input, XLNet outputs two different attention streams.
      Based on that, the attention output will be one of the two:
      - a list of tensors (the usual case for most transformers)
      - a list of 2-tuples of tensors, one tensor for each attention stream
      Docs and unit tests have been updated.
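Downstream code that consumes these attentions has to handle both shapes. A minimal sketch, illustrative and not the actual transformers code, with plain values standing in for tensors:

```python
def split_attention_streams(attentions):
    """Per layer, the entry is either a single tensor or, when
    target_mapping was passed, a 2-tuple of
    (content_stream, query_stream). Returns two parallel lists."""
    content, query = [], []
    for layer_attn in attentions:
        if isinstance(layer_attn, tuple):
            content.append(layer_attn[0])
            query.append(layer_attn[1])
        else:
            content.append(layer_attn)
            query.append(None)  # no query stream without target_mapping
    return content, query
```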
  22. 19 Nov, 2019 1 commit
  23. 11 Oct, 2019 2 commits
  24. 26 Sep, 2019 1 commit
  25. 09 Sep, 2019 1 commit
  26. 08 Sep, 2019 2 commits