1. 07 Jun, 2021 1 commit
  2. 04 Jun, 2021 1 commit
  3. 02 Jun, 2021 4 commits
  4. 01 Jun, 2021 6 commits
  5. 31 May, 2021 1 commit
  6. 28 May, 2021 2 commits
  7. 27 May, 2021 1 commit
    • Adding new argument `max_new_tokens` for generate. (#11476) · 80d712fa
      Nicolas Patry authored
      * Adding new argument `max_new_tokens` for generate.
      
      This is a proposal to add a new argument `max_new_tokens` to `generate`.
      It includes a `MaxNewTokensCriteria` that lets callers who don't know
      the prompt's token length ahead of time (like pipeline callers) manage
      the length of their generated output more easily.
      
      * Adding a test for the user warning when both `max_length` and
      `max_new_tokens` are used together.
      
      * Removed redundant `no_grad`.
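      A minimal sketch of the new argument in use (model name and prompt are illustrative; `max_new_tokens` caps only the freshly generated tokens, independent of the prompt length):

      ```python
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")

      inputs = tokenizer("Hello, my name is", return_tensors="pt")

      # Generate at most 20 *new* tokens, however long the prompt is.
      # Passing `max_length` as well would trigger the user warning tested above.
      output_ids = model.generate(**inputs, max_new_tokens=20)
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```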
  8. 26 May, 2021 3 commits
    • Flax Generate (#11777) · 996a315e
      Patrick von Platen authored
      
      
      * fix_torch_device_generate_test
      
      * remove @
      
      * add
      
      * indexing
      
      * correct a couple of tests
      
      * fix tests
      
      * add logits processor
      
      * finish top_k, top_p, temp
      
      * add docs
      
      * correct flax prng key default
      
      * improve generate
      
      * add generation docs
      
      * add docs
      
      * make style
      
      * revert model outputs change
      
      * make style
      
      * correct typo
      
      * fix tests
      
      * fix slow test
      
      * add raise
      
      * finish generation
      Co-authored-by: Patrick von Platen <patrick@huggingface.co>
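      A hedged sketch of the sampling path this PR adds for Flax models (model, prompt, and sampling values are illustrative; randomness is passed in explicitly via a PRNG key):

      ```python
      import jax
      from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

      input_ids = tokenizer("Hello, my name is", return_tensors="np").input_ids

      # top_k / top_p / temperature are applied by the new Flax logits processors.
      output = model.generate(
          input_ids,
          do_sample=True,
          max_length=30,
          top_k=50,
          top_p=0.95,
          temperature=0.7,
          prng_key=jax.random.PRNGKey(0),
      )
      print(tokenizer.decode(output.sequences[0], skip_special_tokens=True))
      ```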
    • [Flax] Allow dataclasses to be jitted (#11886) · d5a72b6e
      Patrick von Platen authored
      * fix_torch_device_generate_test
      
      * remove @
      
      * change dataclasses to flax ones
      
      * fix typo
      
      * fix jitted tests
      
      * fix bert & electra
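      The point of switching to flax-style dataclasses is that `flax.struct.dataclass` registers the class as a JAX pytree, so jitted functions can return it, whereas a plain Python dataclass cannot be flattened by JAX. A minimal sketch (the `Output` class is illustrative, not one of the library's output classes):

      ```python
      import jax
      import jax.numpy as jnp
      from flax import struct

      @struct.dataclass
      class Output:
          logits: jnp.ndarray
          hidden_states: jnp.ndarray

      @jax.jit
      def forward(x):
          # Returning a flax struct dataclass works under jit because it is a pytree;
          # a plain `dataclasses.dataclass` here would fail to flatten.
          return Output(logits=x * 2.0, hidden_states=x + 1.0)

      out = forward(jnp.ones((2, 3)))
      print(out.logits.shape)  # (2, 3)
      ```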
    • Fix usage of head masks by TF encoder-decoder models' `generate()` function (#11775) · 0b933584
      Daniel Stancl authored
      * Fix Bart
      
      * Fix Blenderbot{,_small}
      
      * Fix LED
      
      * Fix Marian
      
      * Fix MBart
      
      * Fix Pegasus
      
      * Fix T5
      
      * Add test for generation with head_mask
      
      * Add a common TF test
      
      * Override a test for the LED model as head masking is not yet properly implemented
      
      * Remove all head_masks from input preparation for LED
      
      * Drop masking for T5 as it needs a bit of refactor
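      A hedged sketch of what the fix enables, assuming the masks are forwarded to the model as extra keyword arguments of `generate()` (model name and mask pattern are illustrative; bart-base uses the same layer/head counts for encoder and decoder):

      ```python
      import tensorflow as tf
      from transformers import AutoTokenizer, TFBartForConditionalGeneration

      tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
      model = TFBartForConditionalGeneration.from_pretrained("facebook/bart-base")

      inputs = tokenizer("UN Chief says there is no military solution in Syria", return_tensors="tf")

      num_layers = model.config.encoder_layers
      num_heads = model.config.encoder_attention_heads

      # One row per layer, one column per head: 0 disables a head, 1 keeps it.
      # Here the first head of every layer is disabled.
      head_mask = tf.concat(
          [tf.zeros((num_layers, 1)), tf.ones((num_layers, num_heads - 1))], axis=-1
      )

      output_ids = model.generate(
          inputs["input_ids"],
          head_mask=head_mask,
          decoder_head_mask=head_mask,
          max_length=20,
      )
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```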
  9. 25 May, 2021 5 commits
  10. 24 May, 2021 1 commit
  11. 21 May, 2021 2 commits
  12. 20 May, 2021 4 commits
  13. 19 May, 2021 1 commit
  14. 18 May, 2021 5 commits
    • Fix usage of head masks by PT encoder-decoder models' `generate()` function (#11621) · 680d181c
      Daniel Stancl authored
      * Add missing head masking for generate() function
      
      * Add head_mask, decoder_head_mask and cross_attn_head_mask
      to prepare_inputs_for_generation for the generate() function
      of multiple encoder-decoder models.
      
      * Add test_genereate_with_head_masking
      
      * [WIP] Update the new test and handle special cases
      
      * make style
      
      * Omit ProphetNet test so far
      
      * make fix-copies
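      A hedged sketch of the PyTorch side, assuming `head_mask`, `decoder_head_mask` and `cross_attn_head_mask` are passed through `generate()` into `prepare_inputs_for_generation` as this PR describes (model name and masked head are illustrative):

      ```python
      import torch
      from transformers import AutoTokenizer, BartForConditionalGeneration

      tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
      model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

      inputs = tokenizer("UN Chief says there is no military solution in Syria", return_tensors="pt")

      # Shape (num_layers, num_heads); 0.0 disables a head, 1.0 keeps it.
      head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
      head_mask[0, 0] = 0.0  # prune a single head in the first layer

      output_ids = model.generate(
          inputs["input_ids"],
          head_mask=head_mask,
          decoder_head_mask=head_mask,  # bart-base: decoder has the same shape
          cross_attn_head_mask=head_mask,
          max_length=20,
      )
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```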
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      
      
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply Sylvain's suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
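      A minimal sketch of the new Flax port in use (prompt is illustrative; the forward pass returns a logits tensor over the vocabulary):

      ```python
      from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

      inputs = tokenizer("Hello, my name is", return_tensors="np")

      # Plain forward pass; the cache added in this PR is used internally by generate().
      outputs = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])
      print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
      ```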
    • Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752) · fd3b12e8
      Vyom Pathak authored
      * Fixed: Better names for nlp variables in pipelines' tests and docs.
      
      * Fixed: Better variable names
    • Fix checkpoint deletion (#11748) · a515caa3
      Sylvain Gugger authored
    • [TokenClassification] Label realignment for subword aggregation (#11680) · b88e0e01
      Nicolas Patry authored
      * [TokenClassification] Label realignment for subword aggregation
      
      Tentative replacement for https://github.com/huggingface/transformers/pull/11622/files
      
      - Added `AggregationStrategy`.
      - The `ignore_subwords` and `grouped_entities` arguments are now fused
        into `aggregation_strategy`. This makes more sense, because
        `ignore_subwords=True` with `grouped_entities=False` had no meaning
        anyway.
      - Added two new aggregation strategies: MAX and AVERAGE.
      - AVERAGE requires a bit more information than the others, so for now
        this case is handled somewhat specially; we should keep that in mind
        for future changes.
      - Testing has been modified to reflect the new argument and to check
        both the correct deprecation and the new aggregation_strategy.
      - The test arguments and expected results for aggregation_strategy are
        placed close together, so that readers can understand what is
        supposed to happen.
      - `aggregate` is now only tested on a small model, as it is not
        meaningful to test it globally for all models.
      - Previous tests are unchanged in desired output.
      - Added a new test case that better showcases the difference between
        the FIRST, MAX and AVERAGE strategies.
      
      * Wrong framework.
      
      * Addressing three issues.
      
      1- Tags might not follow the B-, I- convention, so any tag should work
      now (assumed to be B-TAG).
      2- Fixed an issue with AVERAGE that led to a substantial code change.
      3- The testing suite was not checking the "index" key for the "none"
      strategy. This is now fixed.
      
      The issue is that "O" could never be chosen by the AVERAGE strategy,
      because those tokens were filtered out beforehand and their relative
      scores were not counted in the average. Filtering on ignore_labels now
      happens at the very end of the pipeline, fixing that issue.
      It is a bit hard to make sure this stays like that, because we do not
      have an end-to-end test for that behavior.
      
      * Formatting.
      
      * Adding formatting to code + cleaner handling of B-, I- tags.
      Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
      Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
      
      * Typo.
      Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
      Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
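      A hedged sketch of the fused argument (model name and strategy are illustrative; "none", "simple", "first", "max" and "average" correspond to the `AggregationStrategy` values that replace `grouped_entities` / `ignore_subwords`):

      ```python
      from transformers import pipeline

      ner = pipeline(
          "ner",
          model="dbmdz/bert-large-cased-finetuned-conll03-english",
          aggregation_strategy="average",  # replaces grouped_entities / ignore_subwords
      )
      print(ner("Hugging Face is based in New York City"))
      ```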
  15. 17 May, 2021 1 commit
  16. 14 May, 2021 1 commit
  17. 13 May, 2021 1 commit