1. 18 Feb, 2021 2 commits
  2. 17 Feb, 2021 4 commits
  3. 16 Feb, 2021 2 commits
  4. 15 Feb, 2021 5 commits
  5. 13 Feb, 2021 1 commit
    • Conversion from slow to fast for BPE spm vocabs contained an error. (#10120) · c9837a0d
      Nicolas Patry authored
      * Conversion from slow to fast for BPE spm vocabs contained an error.
      
      - Only one test currently (tokenizers + slow) exercised the modified path,
      and it is Reformer, which does not contain any id modifications, so the
      bug was silent until now.
      - The real issue is that the `vocab` variable was overwritten by
      `SentencePieceExtractor`, causing slow-tokenizer-specific vocab oddities to be
      completely ignored (see the sketch after this entry).
      - The bug was reported here: https://github.com/huggingface/transformers/issues/9518
      - Ran the complete tokenization test suite with slow tests enabled, without error
      (`RUN_SLOW=1 pytest -sv tests/test_tokenization_*`)
      
      * Remove rebase error.
      
      * Adding the fixture.
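      A minimal, hypothetical sketch of the variable-shadowing pattern described above (names and structure are illustrative, not the actual transformers converter code):

      ```python
      # Hypothetical illustration of the reported bug pattern, not the real converter.
      def build_fast_vocab(slow_vocab, spm_vocab, spm_merges):
          vocab = dict(slow_vocab)      # slow-specific oddities collected here
          vocab["<mask>"] = len(vocab)  # e.g. a slow-only id adjustment (made up)

          # BUG: rebinding `vocab` discards every adjustment made above.
          vocab, merges = spm_vocab, spm_merges
          return vocab, merges          # slow-specific entries are silently lost
      ```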
  6. 12 Feb, 2021 2 commits
  7. 11 Feb, 2021 1 commit
  8. 10 Feb, 2021 3 commits
    • remove adjust_logits_during_generation method (#10087) · c130e67d
      Suraj Patil authored
      * add forced logits processors (sketched after this entry)
      
      * delete adjust_logits method
      
      * add forced_eos_token_id argument in config
      
      * add tests for forced logits processors
      
      * update gen utils tests
      
      * add forced option to tf generate
      
      * remove adjust_logits method from tf models
      
      * update adjust_logits for marian
      
      * delete _force_token_id_to_be_generated method
      
      * style
      
      * import warnings
      
      * pass max_length to _get_logits_processor
      
      * set forced_eos_token_id to None
      
      * set forced attributes in conf utils
      
      * typo
      
      * fix rag generate
      
      * add forced_eos_token_id in rag config
      
      * remove force_bos_token_to_be_generated from BartConfig
      
      * remove _force_token_ids_generation from FSMT
      
      * nit
      
      * fix negative constant
      
      * apply suggestions from code review
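      A minimal sketch of the forced-EOS idea (assumptions: a logits processor called with `(input_ids, scores)` at each generation step, as in transformers' `LogitsProcessor` interface; the class names and details in the PR may differ):

      ```python
      import torch

      class ForcedEOSTokenSketch:
          """Force `eos_token_id` at the final step once `max_length` is reached."""

          def __init__(self, max_length: int, eos_token_id: int):
              self.max_length = max_length
              self.eos_token_id = eos_token_id

          def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
              cur_len = input_ids.shape[-1]
              if cur_len == self.max_length - 1:
                  # Mask every token except EOS so search/sampling must emit it.
                  mask = torch.full_like(scores, float("-inf"))
                  mask[:, self.eos_token_id] = 0.0
                  scores = scores + mask
              return scores
      ```

      With a `forced_eos_token_id` in the config, `generate()` can build such a processor in `_get_logits_processor` instead of every model overriding `adjust_logits_during_generation`.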
    • Fix TF LED/Longformer attentions computation (#10007) · 22a32cf4
      Julien Plu authored
      * Fix test
      
      * Remove commented test
      
      * Fix name
      
      * Apply style
      
      * Fix check copies
      
      * Remove prints
      
      * Restore boolean
      
      * Fix reshape
  9. 09 Feb, 2021 3 commits
  10. 08 Feb, 2021 9 commits
    • Integration test for electra model (#10073) · 263fac71
      sandip authored
    • Implementing the test integration of BertGeneration (#9990) · 3b7e612a
      demSd authored
      * claiming this issue
      
      * Integration test for BertGeneration(Encoder and Decoder)
      
      * fix code quality
    • fix bert2bert test (#10063) · 9e795eac
      Patrick von Platen authored
    • Restore TF embeddings and attention layers to their previous version (#9890) · 31563e05
      Julien Plu authored
      * Refactor BERT
      
      * Restore all the concerned models
      
      * Remove print
      
      * Update template
      
      * Apply Sylvain's and Morgan's comments
      
      * Fix cast
      
      * Put the cast inside call
      
      * Remove conditional in embeddings
      
      * Fix funnel
      
      * Restore previous dot product (attention_scores) computation (see the sketch after this entry)
      
      * Add ConvBERT and BART
      
      * Make all the S2S models ONNX compliant
      
      * Fix test
      
      * Fix check copies
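      For reference, the standard scaled dot-product attention scores that BERT-style layers compute (a generic sketch; the restored TF code may organize the ops differently):

      ```python
      import math
      import tensorflow as tf

      def attention_scores(query: tf.Tensor, key: tf.Tensor, head_dim: int) -> tf.Tensor:
          # (batch, heads, seq_q, d) x (batch, heads, seq_k, d)^T -> (batch, heads, seq_q, seq_k)
          scores = tf.matmul(query, key, transpose_b=True)
          return scores / math.sqrt(head_dim)
      ```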
    • Disable temporarily too slow tests (Longformer/LED) (#10062) · 8bb52bd2
      Julien Plu authored
      * Disable temporarily too slow tests
      
      * Fix style
      
      * Fix template
    • Cleaning up `ConversationalPipeline` to support more than DialoGPT. (#10002) · b1aa4982
      Nicolas Patry authored
      * Cleaning up `ConversationalPipeline` to support more than DialoGPT.
      
      Currently, ConversationalPipeline is heavily biased towards DialoGPT,
      which is the default model for this pipeline.
      
      This PR moves the DialoGPT-specific modifications back into
      tokenizer-specific behavior wherever possible, by creating a
      `_build_conversation_input_ids` method that takes a conversation
      as input and returns a list of ints corresponding to the tokens.
      It feels natural to put it there because models probably all have
      different strategies to build input_ids from the full conversation,
      and it's the tokenizer's job to transform strings into tokens
      (and vice versa). A sketch of such a hook follows this entry.
      
      If `_build_conversation_input_ids` is missing, the previous behavior is
      used, so nothing breaks so far (except for blenderbot, where it's a fix).
      
      This PR also contains a fix for overly long inputs. There used
      to be dead code that tried to limit the size of the incoming input.
      The introduced fix is to truncate
      within `_build_conversation_input_ids` to `tokenizer.model_max_length`.
      This matches the intent of the removed dead code and is actually
      better because it uses `model_max_length`, which is different
      from `max_length` (a default parameter for `generate`).
      
      - Removed the `history` logic from Conversation, as it's no longer
      relevant now that the tokenization logic has been moved to the tokenizer.
      The tokenizer cannot save any cache, and the conversation cannot know
      what is relevant or not.
      It's also not usable from `blenderbot` because the input_ids are
      not append-only (the EOS token is always at the end).
      
      - Added an `iter_texts` method on `Conversation` because all
      the code was littered with some form of this iteration over
      past/generated_responses.
      
      * Removing torch mention in types.
      
      * Adding type checking to `_build_conversation_input_ids`.
      
      * Fixing import in strings.
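      A minimal sketch of such a tokenizer hook, following the DialoGPT-style convention described above (a hypothetical simplification defined on a tokenizer class; the shipped per-tokenizer implementations may differ):

      ```python
      def _build_conversation_input_ids(self, conversation) -> list:
          # Concatenate every turn, each followed by EOS, as DialoGPT-style models expect.
          input_ids = []
          for is_user, text in conversation.iter_texts():
              input_ids.extend(self.encode(text, add_special_tokens=False))
              input_ids.append(self.eos_token_id)
          # Truncate from the left to `tokenizer.model_max_length`, keeping the most
          # recent turns (the fix for overly long inputs described above).
          if len(input_ids) > self.model_max_length:
              input_ids = input_ids[-self.model_max_length:]
          return input_ids
      ```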
    • fix bart tests (#10060) · 9a0399e1
      Patrick von Platen authored
    • Fix slow dpr test (#10059) · d51302cc
      Lysandre Debut authored
      * Correct cast to device
      
      * Comment back the slow test
    • Integration test for FlauBert (#10022) · 12e44af5
      sandip authored
  11. 04 Feb, 2021 4 commits
  12. 03 Feb, 2021 4 commits