1. 07 Aug, 2020 1 commit
  2. 05 Aug, 2020 1 commit
    • Tf model outputs (#6247) · c67d1a02
      Sylvain Gugger authored
      * TF outputs and test on BERT
      
      * Albert to DistilBert
      
      * All remaining TF models except T5
      
      * Documentation
      
      * One file forgotten
      
      * Add new models and fix issues
      
      * Quality improvements
      
      * Add T5
      
      * A bit of cleanup
      
      * Fix for slow tests
      
      * Style
  3. 04 Aug, 2020 1 commit
  4. 03 Aug, 2020 2 commits
  5. 01 Aug, 2020 1 commit
  6. 31 Jul, 2020 3 commits
  7. 30 Jul, 2020 4 commits
    • Doc tokenizer (#6110) · f3065abd
      Sylvain Gugger authored
      
      
      * Start doc tokenizers
      
      * Tokenizer documentation
      
      * Formatting after rebase
      
      * Formatting after merge
      
      * Update docs/source/main_classes/tokenizer.rst
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Address comment
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      
      * Address Thom's comments
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
    • Addition of a DialoguePipeline (#5516) · e642c789
      guillaume-be authored
      
      
      * initial commit for pipeline implementation
      
      Addition of input processing and history concatenation
      
      * Conversation pipeline tested and working for single & multiple conversation inputs
      
      * Added docstrings for dialogue pipeline
      
      * Addition of dialogue pipeline integration tests
      
      * Delete test_t5.py
      
      * Fixed max code length
      
      * Updated styling
      
      * Fixed test broken by formatting tools
      
      * Removed unused import
      
      * Added unit test for DialoguePipeline
      
      * Fixed Tensorflow compatibility
      
      * Fixed multi-framework support using framework flag
      
      * - Fixed docstring
      - Added `min_length_for_response` as an initialization parameter
      - Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
      - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
      
      * - renamed pipeline name from dialogue to conversational
      - removed hardcoded default value of 1000 and use config.max_length instead
      - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
      - fixed bug in history truncation method
      
      * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
      
      * - Simplified input tensor conversion
      
      * - Updated attention_mask value for Tensorflow compatibility
      
      * - Updated last dialogue reference to conversational & fixed integration tests
      
      * Fixed conflict with master
      
      * Updates following review comments
      
      * Updated formatting
      
      * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
      
      * Update src/transformers/pipelines.py
      
      Updated docstring following review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
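The segment-level truncation described in the DialoguePipeline commit above (dropping whole user/bot turns instead of cutting inside one) can be illustrated with a minimal sketch. The helper name `truncate_history` and the list-of-token-lists representation are assumptions made for illustration; this is not the pipeline's actual code. Only the parameter names `max_length` and `min_length_for_response` come from the commit message.

```python
from typing import List


def truncate_history(
    segments: List[List[int]], max_length: int, min_length_for_response: int
) -> List[List[int]]:
    """Drop the oldest whole segments until the rest leaves room for a reply.

    Hypothetical helper: `segments` holds one token-id list per user or bot turn.
    """
    budget = max_length - min_length_for_response
    kept = list(segments)
    # Remove complete turns from the front; never slice inside a turn.
    while kept and sum(len(s) for s in kept) > budget:
        kept.pop(0)
    return kept


history = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
print(truncate_history(history, max_length=10, min_length_for_response=3))
# [[4, 5], [6, 7, 8, 9]]
```

With a budget of 7 tokens, the oldest three-token turn is dropped whole and the two most recent turns survive intact.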
    • Switch from return_tuple to return_dict (#6138) · 91cb9546
      Sylvain Gugger authored
      
      
      * Switch from return_tuple to return_dict
      
      * Fix test
      
      * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
      
      * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
      
      * AutoModels
      
      
      Tiny tweaks
      
      * Style
      
      * Final changes before merge
      
      * Re-order for simpler review
      
      * Final fixes
      
      * Addressing @sgugger's comments
      
      * Test MultipleChoice
      
      * Rework TF trainer (#6038)
      
      * Fully rework training/prediction loops
      
      * fix method name
      
      * Fix variable name
      
      * Fix property name
      
      * Fix scope
      
      * Fix method name
      
      * Fix tuple index
      
      * Fix tuple index
      
      * Fix indentation
      
      * Fix variable name
      
      * fix eval before log
      
      * Add drop remainder for test dataset
      
      * Fix step number + fix logging datetime
      
      * fix eval loss value
      
      * use global step instead of step + fix logging at step 0
      
      * Fix logging datetime
      
      * Fix global_step usage
      
      * Fix breaking loop + logging datetime
      
      * Fix step in prediction loop
      
      * Fix step breaking
      
      * Fix train/test loops
      
      * Force TF at least 2.2 for the trainer
      
      * Use assert_cardinality to facilitate the dataset size computation
      
      * Log steps per epoch
      
      * Make tfds compliant with TPU
      
      * Make tfds compliant with TPU
      
      * Use TF dataset enumerate instead of the Python one
      
      * revert previous commit
      
      * Fix data_dir
      
      * Apply style
      
      * rebase on master
      
      * Address Sylvain's comments
      
      * Address Sylvain's and Lysandre's comments
      
      * Trigger CI
      
      * Remove unused import
      
      * Add recent model
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Julien Plu <plu.julien@gmail.com>
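The flag rename in the commit above suggests an opt-in structured output in place of a plain tuple. The sketch below illustrates that idea only; `SketchOutput` and `forward` are hypothetical names, the default value of the flag is an assumption, and none of this is the library's implementation.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class SketchOutput:
    """Hypothetical stand-in for a model output object (not the library class)."""

    logits: Tuple[float, ...]
    loss: Optional[float] = None

    def to_tuple(self):
        # Keep declaration order and skip fields that were never set.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


def forward(inputs, return_dict=False):
    """Toy forward pass: a plain tuple unless return_dict=True (assumed default)."""
    output = SketchOutput(logits=tuple(2.0 * x for x in inputs))
    return output if return_dict else output.to_tuple()


print(forward([1.0, 2.0]))                           # ((2.0, 4.0),)
print(forward([1.0, 2.0], return_dict=True).logits)  # (2.0, 4.0)
```

Named access avoids depending on positional indices, which shift whenever an optional field such as `loss` is present.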
    • Actually the extra_id are from 0-99 and not from 1-100 (#5967) · d24ea708
      Oren Amsalem authored
      a = tokenizer.encode("we got a <extra_id_99>", return_tensors='pt',add_special_tokens=True)
      print(a)
      >tensor([[   62,   530,     3,     9, 32000]])
      a = tokenizer.encode("we got a <extra_id_100>", return_tensors='pt',add_special_tokens=True)
      print(a)
      >tensor([[   62,   530,     3,     9,     3,     2, 25666,   834,    23,    26,
                 834,  2915,  3155]])
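The two encodings in the commit above are consistent with a layout in which the 100 sentinel tokens sit at the top of the vocabulary in reverse order: `<extra_id_99>` maps to the single id 32000, while `<extra_id_100>` is not a sentinel and is split into ordinary subword pieces. A minimal sketch of that mapping, assuming a 32,100-entry vocabulary; the helper `sentinel_token_id` is hypothetical and not part of the tokenizer:

```python
def sentinel_token_id(n, vocab_size=32100, num_sentinels=100):
    """Return the id of <extra_id_n> under the assumed layout.

    Assumption: sentinels fill the last `num_sentinels` ids in reverse
    order, so <extra_id_0> gets the highest id.
    """
    if not 0 <= n < num_sentinels:
        raise ValueError(
            f"<extra_id_{n}> is not a sentinel; valid range is 0-{num_sentinels - 1}"
        )
    return vocab_size - 1 - n


print(sentinel_token_id(99))  # 32000, matching the first encoding above
print(sentinel_token_id(0))   # 32099
```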
  8. 29 Jul, 2020 2 commits
  9. 27 Jul, 2020 1 commit
  10. 24 Jul, 2020 1 commit
  11. 22 Jul, 2020 1 commit
  12. 21 Jul, 2020 1 commit
  13. 20 Jul, 2020 1 commit
  14. 14 Jul, 2020 1 commit
  15. 13 Jul, 2020 2 commits
  16. 10 Jul, 2020 2 commits
  17. 09 Jul, 2020 2 commits
  18. 08 Jul, 2020 1 commit
  19. 07 Jul, 2020 4 commits
    • Guide to fixed-length model perplexity evaluation (#5449) · b4b33fdf
      Joe Davison authored
      * add first draft ppl guide
      
      * upload imgs
      
      * expand on strides
      
      * ref typo
      
      * rm superfluous past var
      
      * add tokenization disclaimer
    • Add mbart-large-cc25, support translation finetuning (#5129) · 353b8f1e
      Sam Shleifer authored
      improve unittests for finetuning, especially w.r.t testing frozen parameters
      fix freeze_embeds for T5
      add streamlit setup.cfg
    • [docs] fix model_doc links in model summary (#5566) · 33e43edd
      Suraj Patil authored
      * fix model_doc links
      
      * update model links
    • Add DPR model (#5279) · fbd87921
      Quentin Lhoest authored
      
      
      * beginning of dpr modeling
      
      * wip
      
      * implement forward
      
      * remove biencoder + better init weights
      
      * export dpr model to embed model for nlp lib
      
      * add new api
      
      * remove old code
      
      * make style
      
      * fix dumb typo
      
      * don't load bert weights
      
      * docs
      
      * docs
      
      * style
      
      * move the `k` parameter
      
      * fix init_weights
      
      * add pretrained configs
      
      * minor
      
      * update config names
      
      * style
      
      * better config
      
      * style
      
      * clean code based on PR comments
      
      * change Dpr to DPR
      
      * fix config
      
      * switch encoder config to a dict
      
      * style
      
      * inheritance -> composition
      
      * add messages in assert statements
      
      * add dpr reader tokenizer
      
      * one tokenizer per model
      
      * fix base_model_prefix
      
      * fix imports
      
      * typo
      
      * add convert script
      
      * docs
      
      * change tokenizers conf names
      
      * style
      
      * change tokenizers conf names
      
      * minor
      
      * minor
      
      * fix wrong names
      
      * minor
      
      * remove unused convert functions
      
      * rename convert script
      
      * use return_tensors in tokenizers
      
      * remove n_questions dim
      
      * move generate logic to tokenizer
      
      * style
      
      * add docs
      
      * docs
      
      * quality
      
      * docs
      
      * add tests
      
      * style
      
      * add tokenization tests
      
      * DPR full tests
      
      * Stay true to the attention mask building
      
      * update docs
      
      * missing param in bert input docs
      
      * docs
      
      * style
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  20. 06 Jul, 2020 4 commits
  21. 02 Jul, 2020 2 commits
  22. 01 Jul, 2020 2 commits