  1. 15 Jun, 2020 2 commits
    • [HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510) · 36434220
      Anthony MOI authored
      * Use tokenizers pre-tokenized pipeline
      
      * failing pretokenized test
      
      * Fix is_pretokenized in python
      
      * add pretokenized tests
      
      * style and quality
      
      * better tests for batched pretokenized inputs
      
      * tokenizers clean up - new padding_strategy - split the files
      
      * [HUGE] refactoring tokenizers - padding - truncation - tests
      
      * style and quality
      
      * bump up required tokenizers version to 0.8.0-rc1
      
      * switched padding/truncation API - simpler better backward compat
      
      * updating tests for custom tokenizers
      
      * style and quality - tests on pad
      
      * fix QA pipeline
      
      * fix backward compatibility for max_length only
      
      * style and quality
      
      * Various cleanups - add verbose
      
      * fix tests
      
      * update docstrings
      
      * Fix tests
      
      * Docs reformatted
      
      * __call__ method documented
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
    • Add bart-base (#5014) · a9f1fc6c
      Sam Shleifer authored
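The padding and truncation options reworked in the tokenizers refactoring (#4510) listed above can be illustrated with a small sketch. This is not the library's implementation: `pad_and_truncate`, its parameters, and the three padding modes used here (`"longest"`, `"max_length"`, `"do_not_pad"`) are assumptions chosen for illustration. The sketch works on lists of token IDs and assumes right-side truncation and padding.

```python
from typing import List, Optional


def pad_and_truncate(
    batch: List[List[int]],
    padding: str = "longest",
    truncation: bool = False,
    max_length: Optional[int] = None,
    pad_id: int = 0,
) -> dict:
    """Hypothetical helper: truncate, then pad, a batch of token-ID lists."""
    if padding not in ("longest", "max_length", "do_not_pad"):
        raise ValueError(f"unknown padding mode: {padding!r}")
    if (truncation or padding == "max_length") and max_length is None:
        raise ValueError("max_length is required for truncation or 'max_length' padding")

    # Truncate first so that padding sees the final sequence lengths.
    if truncation:
        batch = [ids[:max_length] for ids in batch]

    if padding == "longest":
        target = max((len(ids) for ids in batch), default=0)
    elif padding == "max_length":
        target = max_length
    else:
        target = None  # "do_not_pad": leave lengths untouched

    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = 0 if target is None else max(target - len(ids), 0)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}
```

For example, `pad_and_truncate([[5, 6, 7, 8], [9]], truncation=True, max_length=3)` cuts the first sequence to three tokens and pads the second up to the same length.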
  2. 12 Jun, 2020 2 commits
  3. 10 Jun, 2020 1 commit
    • ElectraForQuestionAnswering (#4913) · ef2dcdcc
      Suraj Patil authored
      * ElectraForQuestionAnswering
      
      * update __init__
      
      * add test for electra qa model
      
      * add ElectraForQuestionAnswering in auto models
      
      * add ElectraForQuestionAnswering in all_model_classes
      
      * fix outputs, input_ids defaults to None
      
      * add ElectraForQuestionAnswering in docs
      
      * remove commented line
  4. 09 Jun, 2020 1 commit
  5. 08 Jun, 2020 1 commit
  6. 06 Jun, 2020 1 commit
  7. 05 Jun, 2020 3 commits
  8. 04 Jun, 2020 1 commit
  9. 03 Jun, 2020 1 commit
    • Pipelines: miscellanea of QoL improvements and small features... (#4632) · 99207bd1
      Julien Chaumond authored
      * [hf_api] Attach all unknown attributes for future-proof compatibility
      
      * [Pipeline] NerPipeline is really a TokenClassificationPipeline
      
      * modelcard.py: I don't think we need to force the download
      
      * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer
      
      * FillMaskPipeline: also output token in string form
      
      * TextClassificationPipeline: option to return all scores, not just the argmax
      
      * Update docs/source/main_classes/pipelines.rst
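The "return all scores, not just the argmax" option mentioned in the pipelines commit (#4632) above can be sketched as a post-processing step over raw classifier logits. This is a minimal illustration, not the pipeline's actual code: `classify`, its `return_all_scores` flag, and the output dictionary layout are hypothetical names assumed for this example.

```python
import math
from typing import Dict, List, Union


def classify(
    logits: List[float],
    labels: List[str],
    return_all_scores: bool = False,
) -> Union[Dict[str, object], List[Dict[str, object]]]:
    """Hypothetical post-processing step for a text classifier."""
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    scores = [e / total for e in exps]

    if return_all_scores:
        # One entry per label, in the order the labels were given.
        return [{"label": label, "score": score} for label, score in zip(labels, scores)]

    # Default behaviour: only the highest-scoring label.
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": labels[best], "score": scores[best]}
```

Calling `classify(logits, labels)` yields a single label with its score, while `classify(logits, labels, return_all_scores=True)` yields the full distribution, which is useful when the caller wants to apply its own threshold.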
  10. 02 Jun, 2020 3 commits
  11. 29 May, 2020 2 commits
  12. 27 May, 2020 1 commit
  13. 26 May, 2020 1 commit
  14. 25 May, 2020 1 commit
  15. 22 May, 2020 2 commits
  16. 19 May, 2020 2 commits
    • [Longformer] Docs and clean API (#4464) · 48c3a70b
      Patrick von Platen authored
      * add longformer docs
      
      * improve docs
    • Longformer (#4352) · 8f1d0471
      Iz Beltagy authored
      * first commit
      
      * bug fixes
      
      * better examples
      
      * undo padding
      
      * remove wrong VOCAB_FILES_NAMES
      
      * License
      
      * make style
      
      * make isort happy
      
      * unit tests
      
      * integration test
      
      * make `black` happy by undoing `isort` changes!!
      
      * lint
      
      * no need for the padding value
      
      * batch_size not bsz
      
      * remove unused type casting
      
      * seqlen not seq_len
      
      * staticmethod
      
      * `bert` selfattention instead of `n2`
      
      * uint8 instead of bool + lints
      
      * pad inputs_embeds using embeddings not a constant
      
      * black
      
      * unit test with padding
      
      * fix unit tests
      
      * remove redundant unit test
      
      * upload model weights
      
      * resolve todo
      
      * simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
      
      * increase unittest coverage
  17. 18 May, 2020 1 commit
  18. 13 May, 2020 3 commits
  19. 11 May, 2020 5 commits
  20. 10 May, 2020 2 commits
  21. 07 May, 2020 4 commits