1. 20 Aug, 2020 2 commits
    • add intro to nlp lib & dataset links to custom datasets tutorial (#6583) · 039d8d65
      Joe Davison authored
      * add intro to nlp lib + links
      
      * unique links...
      039d8d65
    • Docs copy button misses ... prefixed code (#6518) · cabfdfaf
      Romain Rigaux authored
      Tested in a local build of the docs.
      
      e.g. Just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling
      
      With this fix, the copy button copies the full code, e.g.
      
      for token in top_5_tokens:
          print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
      
      Instead of only the first line, as it currently does:
      
      for token in top_5_tokens:
      
      
      >>> for token in top_5_tokens:
      ...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))
      Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
      Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
      Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
      Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
      Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.
      
      Docs for the option fix:
      https://sphinx-copybutton.readthedocs.io/en/latest/
      cabfdfaf
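The behaviour this commit describes can be sketched in plain Python. The helper below is a hypothetical illustration, not code from sphinx-copybutton or transformers: `strip_prompts` and `PROMPT` are made-up names, and the sketch assumes that the prompts are exactly `>>> ` and `... ` and that unprompted lines are output that should not be copied.

```python
import re

# Assumed prompt pattern: the primary ">>> " and the continuation "... " prompt.
PROMPT = re.compile(r"^(>>> |\.\.\. )")


def strip_prompts(block: str) -> str:
    """Return only the prompted lines of a doctest-style block, without prompts."""
    copied = []
    for line in block.splitlines():
        match = PROMPT.match(line)
        if match:  # lines without a prompt are treated as output and skipped
            copied.append(line[match.end():])
    return "\n".join(copied)


example = (
    ">>> for token in top_5_tokens:\n"
    "...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))\n"
    "Distilled models are smaller than the models they mimic.\n"
)
print(strip_prompts(example))
```

Matching only `>>> ` would keep just the first line of the loop, which is the symptom reported above. sphinx-copybutton documents the `copybutton_prompt_text` and `copybutton_prompt_is_regexp` settings for this purpose; the linked docs remain the authority on the exact options.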
  2. 19 Aug, 2020 1 commit
  3. 18 Aug, 2020 4 commits
  4. 17 Aug, 2020 9 commits
  5. 14 Aug, 2020 3 commits
  6. 12 Aug, 2020 3 commits
  7. 11 Aug, 2020 2 commits
  8. 10 Aug, 2020 4 commits
  9. 07 Aug, 2020 1 commit
  10. 05 Aug, 2020 1 commit
    • Tf model outputs (#6247) · c67d1a02
      Sylvain Gugger authored
      * TF outputs and test on BERT
      
      * Albert to DistilBert
      
      * All remaining TF models except T5
      
      * Documentation
      
      * One file forgotten
      
      * Add new models and fix issues
      
      * Quality improvements
      
      * Add T5
      
      * A bit of cleanup
      
      * Fix for slow tests
      
      * Style
      c67d1a02
  11. 04 Aug, 2020 1 commit
  12. 03 Aug, 2020 2 commits
  13. 01 Aug, 2020 1 commit
  14. 31 Jul, 2020 3 commits
  15. 30 Jul, 2020 3 commits
    • Doc tokenizer (#6110) · f3065abd
      Sylvain Gugger authored
      
      
      * Start doc tokenizers
      
      * Tokenizer documentation
      
      * Formatting after rebase
      
      * Formatting after merge
      
      * Update docs/source/main_classes/tokenizer.rst
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Address comment
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      
      * Address Thom's comments
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      f3065abd
    • Addition of a DialoguePipeline (#5516) · e642c789
      guillaume-be authored
      
      
      * initial commit for pipeline implementation
      
      Addition of input processing and history concatenation
      
      * Conversation pipeline tested and working for single & multiple conversation inputs
      
      * Added docstrings for dialogue pipeline
      
      * Addition of dialogue pipeline integration tests
      
      * Delete test_t5.py
      
      * Fixed max code length
      
      * Updated styling
      
      * Fixed test broken by formatting tools
      
      * Removed unused import
      
      * Added unit test for DialoguePipeline
      
      * Fixed TensorFlow compatibility
      
      * Fixed multi-framework support using framework flag
      
      * - Fixed docstring
      - Added `min_length_for_response` as an initialization parameter
      - Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
      - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
      
      * - renamed pipeline name from dialogue to conversational
      - removed hardcoded default value of 1000 and use config.max_length instead
      - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
      - fixed bug in history truncation method
      
      * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
      
      * - Simplified input tensor conversion
      
      * - Updated attention_mask value for TensorFlow compatibility
      
      * - Updated last dialogue reference to conversational & fixed integration tests
      
      * Fixed conflict with master
      
      * Updates following review comments
      
      * Updated formatting
      
      * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
      
      * Update src/transformers/pipelines.py
      
      Updated docstring following review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      e642c789
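The segment-level truncation mentioned in this commit ("truncate entire segments of conversations, instead of cutting in the middle of a user/bot input") can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `truncate_history` is a hypothetical helper, each segment is assumed to be one user or bot turn already encoded as a list of token ids, and the real implementation may also account for settings such as `min_length_for_response`.

```python
from typing import List


def truncate_history(segments: List[List[int]], max_length: int) -> List[int]:
    """Drop the oldest whole segments until the flattened history fits in max_length tokens."""
    kept = list(segments)
    # Remove entire user/bot turns from the front; never cut inside a turn.
    while kept and sum(len(segment) for segment in kept) > max_length:
        kept.pop(0)
    return [token for segment in kept for token in segment]


history = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
print(truncate_history(history, max_length=6))  # [4, 5, 6, 7, 8, 9]
```

Dropping whole turns keeps every remaining user or bot input intact, at the cost of sometimes leaving part of the length budget unused.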
    • Switch from return_tuple to return_dict (#6138) · 91cb9546
      Sylvain Gugger authored
      
      
      * Switch from return_tuple to return_dict
      
      * Fix test
      
      * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
      
      * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
      
      * AutoModels
      
      
      Tiny tweaks
      
      * Style
      
      * Final changes before merge
      
      * Re-order for simpler review
      
      * Final fixes
      
      * Addressing @sgugger's comments
      
      * Test MultipleChoice
      
      * Rework TF trainer (#6038)
      
      * Fully rework training/prediction loops
      
      * fix method name
      
      * Fix variable name
      
      * Fix property name
      
      * Fix scope
      
      * Fix method name
      
      * Fix tuple index
      
      * Fix tuple index
      
      * Fix indentation
      
      * Fix variable name
      
      * fix eval before log
      
      * Add drop remainder for test dataset
      
      * Fix step number + fix logging datetime
      
      * fix eval loss value
      
      * use global step instead of step + fix logging at step 0
      
      * Fix logging datetime
      
      * Fix global_step usage
      
      * Fix breaking loop + logging datetime
      
      * Fix step in prediction loop
      
      * Fix step breaking
      
      * Fix train/test loops
      
      * Force TF at least 2.2 for the trainer
      
      * Use assert_cardinality to facilitate the dataset size computation
      
      * Log steps per epoch
      
      * Make tfds compliant with TPU
      
      * Make tfds compliant with TPU
      
      * Use TF dataset enumerate instead of the Python one
      
      * revert previous commit
      
      * Fix data_dir
      
      * Apply style
      
      * rebase on master
      
      * Address Sylvain's comments
      
      * Address Sylvain's and Lysandre's comments
      
      * Trigger CI
      
      * Remove unused import
      
      * Add recent model
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Julien Plu <plu.julien@gmail.com>
      91cb9546
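As a rough illustration of what a `return_dict` flag implies for callers, the toy below returns either a structured object with named fields or a plain tuple. Everything here is hypothetical: `ToyOutput` and `toy_forward` are stand-ins invented for this sketch rather than the library's classes, and the default value of the flag is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class ToyOutput:
    """Hypothetical stand-in for a structured model output."""

    loss: Optional[float]
    logits: Tuple[float, ...]

    def to_tuple(self) -> tuple:
        # Follow the tuple convention of omitting fields that are None.
        return tuple(v for v in (self.loss, self.logits) if v is not None)


def toy_forward(return_dict: bool = False) -> Union[ToyOutput, tuple]:
    """Return a named output when return_dict is True, else a plain tuple."""
    output = ToyOutput(loss=None, logits=(0.1, 0.9))
    return output if return_dict else output.to_tuple()


structured = toy_forward(return_dict=True)
print(structured.logits)  # access by field name
plain = toy_forward(return_dict=False)
print(plain[0])  # access by position
```

With the structured form, callers do not need to know at which position `logits` lands once optional fields such as `loss` are present or absent.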