1. 10 Nov, 2020 5 commits
  2. 09 Nov, 2020 3 commits
  3. 06 Nov, 2020 1 commit
  4. 05 Nov, 2020 2 commits
  5. 04 Nov, 2020 3 commits
  6. 03 Nov, 2020 7 commits
    • [WIP] Ner pipeline grouped_entities fixes (#5970) · 29b536a7
      Ceyda Cinarel authored
      
      
      * Bug fix: NER pipeline shouldn't group separate entities of same type
      
      * style fix
      
      * [Bug Fix] Shouldn't group entities that are both 'B' even if they are same type
      	(B-type1 B-type1) != (B-type1 I-type1)
      [Bug Fix] Add an option `ignore_subwords` to ignore subsequent ##wordpieces in predictions, because some models train on only the first token of a word and not on the subsequent wordpieces (BERT NER default), so it makes sense to do the same thing at inference time.
      	The simplest fix is to just group the subwords with the first wordpiece.
      	[TODO] How to handle ignored scores? Just set them to 0 and calculate a zero-invariant mean?
      	[TODO] Handle a wordpiece_prefix other than ##? Possible approaches:
      		get it from the tokenizer? But currently most tokenizers don't have a wordpiece_prefix property.
      		have an _is_subword(token)
      [Feature add] Added option `skip_special_tokens`, because it was harder to remove them after grouping.
      [Additional Changes] Remove B/I prefix on returned grouped_entities
      [Feature Request/TODO] Return indexes?
      [Bug TODO] Can't use fast tokenizer with grouped_entities ('BertTokenizerFast' object has no attribute 'convert_tokens_to_string')
      
      * use offset_mapping to fix [UNK] token problem
      
      * ignore score for subwords
      
      * modify ner_pipeline test
      
      * modify ner_pipeline test
      
      * modify ner_pipeline test
      
      * ner_pipeline change ignore_subwords default to true
      
      * add ner_pipeline ignore_subword=False test case
      
      * fix offset_mapping index
      
      * fix style again duh
      
      * change is_subword and convert_tokens_to_string logic
      
      * merge tests with new test structure
      
      * change test names
      
      * remove old tests
      
      * ner tests for fast tokenizer
      
      * fast tokenizers have convert_tokens_to_string
      
      * Fix the incorrect merge
      Co-authored-by: Ceyda Cinarel <snu-ceyda@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
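The grouping rules described in the commit message above (two adjacent `B-` tags of the same type stay separate, an `I-` tag continues the preceding entity of the same type, and `##` wordpieces are attached to their first wordpiece) can be illustrated with a minimal sketch. This is not the pipeline's actual code: `group_entities`, `merge_wordpieces`, and the token dict layout are hypothetical names chosen for illustration, and the input is assumed to contain only tokens already tagged as entities.

```python
from typing import Dict, List


def merge_wordpieces(words: List[str]) -> str:
    """Join wordpieces: '##' pieces attach to the previous piece, others get a space."""
    text = ""
    for word in words:
        if word.startswith("##"):
            text += word[2:]
        elif text:
            text += " " + word
        else:
            text = word
    return text


def group_entities(tokens: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Group token-level tags such as 'B-LOC' / 'I-LOC' into entity spans.

    Hypothetical helper: assumes every token dict has 'word' and 'entity' keys.
    """
    groups: List[List[Dict[str, str]]] = []
    for token in tokens:
        prefix, _, entity_type = token["entity"].partition("-")
        if groups:
            # The type of the current group is taken from its first token.
            last_type = groups[-1][0]["entity"].partition("-")[2]
            is_subword = token["word"].startswith("##")
            # A subword always joins its first wordpiece (its own label is ignored);
            # otherwise only an I- tag of the same type continues the group.
            if is_subword or (prefix == "I" and entity_type == last_type):
                groups[-1].append(token)
                continue
        groups.append([token])
    return [
        {
            # B/I prefix is removed from the returned group label.
            "entity_group": group[0]["entity"].partition("-")[2],
            "word": merge_wordpieces([t["word"] for t in group]),
        }
        for group in groups
    ]


example = [
    {"word": "New", "entity": "B-LOC"},
    {"word": "York", "entity": "I-LOC"},
    {"word": "Paris", "entity": "B-LOC"},   # second B-LOC starts a new entity
    {"word": "Hu", "entity": "B-ORG"},
    {"word": "##gging", "entity": "I-PER"},  # subword label is ignored
]
grouped = group_entities(example)
```

Here "New York" and "Paris" come out as two separate `LOC` groups, and the `##gging` piece is merged into the `ORG` group started by its first wordpiece.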
    • [CIs] Better reports everywhere (#8275) · 1bb4bba5
      Stas Bekman authored
      * make it possible to invoke conftest.py in both test suites without crashing on having the same option added
      
      * perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts
      
      * add `pytest --make-reports` to all CIs (and artifacts)
      
      * fix
    • Data collator for token classification (#8274) · 7f556d2e
      Sylvain Gugger authored
      * Add DataCollatorForTokenClassification and clean tests
      
      * Make quality
    • Sylvain Gugger's avatar
      4c19f3ba
    • Updated ConversationalPipeline to work with encoder-decoder models (#8207) · 74f6f91a
      guillaume-be authored
      
      
      * Updated ConversationalPipeline to work with encoder-decoder models (e.g. BlenderBot)
      
      * Addition of integration test for EncoderDecoder conversation model
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    • [FIX] TextGenerationPipeline is currently broken. (#8256) · c66ffa3a
      Nicolas Patry authored
      * [FIX] TextGenerationPipeline is currently broken.
      
      It's most likely due to #8180.
      What's missing is a multi vs single string handler at the beginning of
      the pipe.
      And also there was no testing of this pipeline.
      
      * Fixing Conversational tests too.
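The "multi vs single string handler" mentioned in the commit message above could look roughly like the following sketch. It is an illustration only, not the code from the pull request: `ensure_list` is a hypothetical helper name, and the idea is simply to normalize the input once at the start so the rest of the pipeline always iterates over a list of prompts.

```python
from typing import List, Union


def ensure_list(text_inputs: Union[str, List[str]]) -> List[str]:
    """Return a list of prompts whether one string or several were passed.

    Hypothetical helper: a bare string is wrapped in a list so that later
    code does not accidentally iterate over its characters.
    """
    if isinstance(text_inputs, str):
        return [text_inputs]
    return list(text_inputs)


single = ensure_list("Hello world")
multi = ensure_list(["Hello", "world"])
```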
    • Refactoring the generate() function (#6949) · a1bbcf3f
      Patrick von Platen authored
      * first draft
      
      * show design proposition for new generate method
      
      * up
      
      * make more readable
      
      * make first version
      
      * gpt2 tests pass
      
      * make beam search for gpt2 work
      
      * add first encoder-decoder code
      
      * delete typo
      
      * make t5 work
      
      * save intermediate
      
      * make bart work with beam search
      
      * finish beam search bart / t5
      
      * add default kwargs
      
      * make more tests pass
      
      * fix no bad words sampler
      
      * some fixes and tests for all distribution processors
      
      * fix test
      
      * fix rag slow tests
      
      * merge to master
      
      * add nograd to generate
      
      * make all slow tests pass
      
      * speed up generate
      
      * fix edge case bug
      
      * small fix
      
      * correct typo
      
      * add type hints and docstrings
      
      * fix typos in tests
      
      * add beam search tests
      
      * add tests for beam scorer
      
      * fix test rag
      
      * finish beam search tests
      
      * move generation tests to separate file
      
      * fix generation tests
      
      * more tests
      
      * add aggressive generation tests
      
      * fix tests
      
      * add gpt2 sample test
      
      * add more docstring
      
      * add more docs
      
      * finish doc strings
      
      * apply some more of Sylvain's and Sam's comments
      
      * fix some typos
      
      * make fix copies
      
      * apply Lysandre's and Sylvain's comments
      
      * final corrections on examples
      
      * small fix for reformer
  7. 02 Nov, 2020 3 commits
  8. 30 Oct, 2020 3 commits
    • Replace swish with silu (#8166) · 00112c35
      TFUsers authored
      
      
      * Replace swish with silu
      
      * revert nn.silu to nn.swish due to older version
      
      * simplify optimized silu conditional and fix format
      
      * Update activations.py
      
      * Update activations_tf.py
      
      * Update modeling_flax_utils.py
      
      * Update modeling_openai.py
      
      * add swish testcase
      
      * add pytorch swish testcase
      
      * Add more robust python version check
      
      * more formatting fixes
      Co-authored-by: TFUsers <TFUsers@gmail.com>
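For context on the rename above: "swish" and "SiLU" are two names for the same activation, silu(x) = x * sigmoid(x). A minimal pure-Python sketch follows. It is not the library's implementation, and keeping `swish` as an alias of `silu` is an assumption made here for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function (avoids overflow for large |x|)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU activation, also known as swish: x * sigmoid(x)."""
    return x * sigmoid(x)


# Assumption for illustration: the old name is kept as an alias of the new one.
swish = silu
```

Both names then give identical values, so existing configurations that refer to the old name would keep working under this assumption.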
    • TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987) · 566b083e
      Sam Shleifer authored
      
      
      * Start plumbing
      
      * Marian close
      
      * Small stubs for all children
      
      * Fixed bart
      
      * marian working
      
      * pegasus test is good, but failing
      
      * Checkin tests
      
      * More model files
      
      * Subtle marian, pegasus integration test failures
      
      * Works well
      
      * rm print
      
      * boom boom
      
      * Still failing model2doc
      
      * merge master
      
      * Equivalence test failing, all others fixed
      
      * cleanup
      
      * Fix embed_scale
      
      * Cleanup marian pipeline test
      
      * Undo extra changes
      
      * Smaller delta
      
      * Cleanup model testers
      
      * undo delta
      
      * fix tests import structure
      
      * cross test decorator
      
      * Cleaner set_weights
      
      * Respect authorized_unexpected_keys
      
      * No warnings
      
      * No warnings
      
      * style
      
      * Nest tf import
      
      * black
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * functional dropout
      
      * fixup
      
      * Fixup
      
      * style_doc
      
      * embs
      
      * shape list
      
      * delete slow force_token_id_to_be_generated func
      
      * fixup
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    • Ci test tf super slow (#8007) · 10f8c636
      Lysandre Debut authored
      * Test TF GPU CI
      
      * Change cache
      
      * Fix missing torch requirement
      
      * Fix some model tests
      
      
      * Style
      
      * LXMERT
      
      * MobileBERT
      
      * Longformer skip test
      
      * XLNet
      
      * The rest of the tests
      
      * RAG goes OOM in multi gpu setup
      
      * YAML test files
      
      * Last fixes
      
      * Skip doctests
      
      * Fill mask tests
      
      * Yaml files
      
      * Last test fix
      
      * Style
      
      * Update cache
      
      * Change ONNX tests to slow + use tiny model
  9. 29 Oct, 2020 2 commits
  10. 28 Oct, 2020 1 commit
  11. 27 Oct, 2020 3 commits
  12. 26 Oct, 2020 5 commits
  13. 23 Oct, 2020 2 commits