1. 05 Apr, 2024 13 commits
  2. 04 Apr, 2024 3 commits
  3. 03 Apr, 2024 12 commits
  4. 02 Apr, 2024 11 commits
    • Generate: fix logits processors doctests (#29718) · 5080ab12
      Joao Gante authored
      * fix norm
      
      * fix logits processors doctests
    • Hard error when ignoring tensors. (#27484) (#29906) · 9b0a8ea7
      Nicolas Patry authored
      
      
      * Hard error when ignoring tensors. (#27484)
      
      * [WIP] Hard error when ignoring tensors.
      
      * Better selection/error when saving a checkpoint.
      
      - Find all names we should normally drop (those are in the transformers
        config)
      - Find all disjoint tensors (for those we can safely trigger a copy to
        get rid of the sharing before saving)
      - Clone those disjoint tensors getting rid of the issue
      - Find all identical names (those should be declared in the config
        but we try to find them all anyway.)
      - For all identical names:
        - If they are in the config, just ignore them; everything is fine.
        - If they are not, warn about them.
      - For all remaining tensors that are shared yet neither identical nor
        disjoint, raise a hard error.
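
The selection steps listed above can be sketched roughly as follows. This is an illustrative model only, not the code from the PR: tensors are represented as hypothetical `(storage_id, start, end)` views rather than real tensors, and `classify_shared` and `declared_tied` are names invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for a tensor view: (storage id, start offset, end offset).
View = Tuple[int, int, int]


def classify_shared(views: Dict[str, View], declared_tied: Set[str]):
    """Split names that share a storage into clone / ignore / warn groups.

    Raises RuntimeError for views that overlap without being identical.
    """
    by_storage: Dict[int, List[str]] = defaultdict(list)
    for name, (storage, _, _) in views.items():
        by_storage[storage].append(name)

    to_clone: List[str] = []   # disjoint views: safe to copy before saving
    to_ignore: List[str] = []  # identical views declared in the config
    to_warn: List[str] = []    # identical views missing from the config

    for names in by_storage.values():
        if len(names) < 2:
            continue  # this storage is not shared
        names = sorted(names, key=lambda n: views[n][1:])
        spans = [views[n][1:] for n in names]
        if all(span == spans[0] for span in spans):
            # Identical views: keep one undeclared name as the saved copy.
            keep = next((n for n in names if n not in declared_tied), names[0])
            for name in names:
                if name != keep:
                    (to_ignore if name in declared_tied else to_warn).append(name)
        elif all(prev[1] <= cur[0] for prev, cur in zip(spans, spans[1:])):
            # Disjoint views of one storage: cloning removes the sharing.
            to_clone.extend(names)
        else:
            raise RuntimeError(f"Tensors {names} overlap but are not identical")
    return to_clone, to_ignore, to_warn
```

In this sketch, two names pointing at the same span are treated as tied weights, while two non-overlapping slices of one buffer are treated as safe to clone; anything in between is rejected.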
      
      * Adding a failing test on `main` that passes here.
      
      * We don't need to keep the subfolder logic in this test.
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Add small tests.
      
      * Dead variable.
      
      * Fixup.
      
      * Fixing tied_weights_keys on generic models.
      
      * Fixup + T5 encoder/decoder tying (with different layers)
      
      * Code quality.
      
      * Dynamic member.
      
      * trigger
      
      * Fixing encoder name for other types of encoder/decoder combos.
      
      * Fix scoping.
      
      * Update .github/workflows/self-scheduled.yml
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Fixing the tied_weights after the call.
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Fix `skip_special_tokens` for `Wav2Vec2CTCTokenizer._decode` (#29311) · 15cd6871
      Minsub Lee (Matt) authored
      * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
      
      * Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
      
      * Exclude pad_token filtering since it is used as CTC-blank token
      
      * Add small test for skip_special_tokens
      
      * Update decoding test for added new token
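
A minimal, library-free sketch of why the pad token is excluded from `skip_special_tokens` filtering: in CTC decoding the pad token doubles as the blank symbol, so it has to stay in the sequence until repeated tokens have been collapsed. The function `ctc_decode`, the token strings, and the order of the steps are assumptions made for illustration, not the `Wav2Vec2CTCTokenizer` implementation.

```python
from itertools import groupby
from typing import Iterable, List


def ctc_decode(
    tokens: Iterable[str],
    blank: str = "<pad>",
    special_tokens: Iterable[str] = ("<s>", "</s>", "<unk>", "<pad>"),
    skip_special_tokens: bool = False,
) -> str:
    """Greedy CTC post-processing on a sequence of per-frame tokens."""
    tokens = list(tokens)
    if skip_special_tokens:
        # Drop special tokens, but keep the blank: it separates true repeats.
        skip = set(special_tokens) - {blank}
        tokens = [t for t in tokens if t not in skip]
    # Collapse consecutive duplicates, then remove the blank symbol.
    collapsed: List[str] = [tok for tok, _ in groupby(tokens)]
    return "".join(t for t in collapsed if t != blank)


frames = ["<s>", "h", "h", "<pad>", "e", "l", "<pad>", "l", "o", "</s>"]
decoded = ctc_decode(frames, skip_special_tokens=True)
```

If the blank were filtered out together with the other special tokens, the two `l` frames would become adjacent and collapse into one, which is the failure mode the exclusion avoids.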
    • Add Flash Attention 2 support to Musicgen and Musicgen Melody (#29939) · 0d04b1e2
      Yoach Lacombe authored
      * add FA2 to the original Musicgen
      
      * make style
      
      * add FA2 support to Musicgen Melody
      
      * add generation FA2 tests to the original Musicgen
      
      * make style and fix copies
      
      * add Musicgen to FA2 docs + deprecate list
      
      * add SDPA support to the Musicgen models
      
      * make style and fix copies
      
      * refactor attention implementation arguments
      
      * add Copied from to sdpa tests
      
      * add Copied from in sdpa tests for Melody
      
      * add copied for FA2 generation tests
      
      * add FA2 inference copied from
      
      * make style
    • Adding FlaxNoRepeatNGramLogitsProcessor (#29677) · fed27ffc
      théo gigant authored
      * fix issue with logit processor in beam search in Flax
      
      * adding FlaxNoRepeatNGramLogitsProcessor class + unit test
      
      * style correction and code verification
      
      * add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests
      
      * fix an issue where ngrams are banned only if they appear exactly once + update description of get_previous_ngrams
      
      * replace non-jit compatible masking of ngrams that are not yet generated with jittable version
      
      * Revert "fix issue with logit processor in beam search in Flax"
      
      This reverts commit 09b70d7e4dc32d0cc4db61af09a835a9cd238b50.
      
      * add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor
      
      * change the method of casting to boolean of banned tokens indices
      
      * fix code style
      
      * remove some useless operations + significantly faster computation of update indices using jax.lax.fori_loop
      
      * remove useless loop iterations
      
      * set some variables that were calculated and used multiple times
      
      * fix format
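
For readers unfamiliar with the technique, the idea behind a no-repeat n-gram logits processor can be sketched in plain Python. This is not the Flax implementation from the commit (which has to be jit-compatible and, per the message above, uses `jax.lax.fori_loop`); `banned_next_tokens` and `apply_ban` are hypothetical helper names used only here.

```python
from typing import List, Sequence, Set


def banned_next_tokens(generated: Sequence[int], ngram_size: int) -> Set[int]:
    """Return tokens that would complete an n-gram already in `generated`."""
    if ngram_size < 1 or len(generated) < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned: Set[int] = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i : i + ngram_size])
        if ngram[:-1] == prefix:
            # Banned no matter how many times this n-gram has occurred.
            banned.add(ngram[-1])
    return banned


def apply_ban(scores: List[float], banned: Set[int]) -> List[float]:
    """Set the score of every banned token id to negative infinity."""
    return [float("-inf") if i in banned else s for i, s in enumerate(scores)]


history = [1, 2, 3, 1, 2]
banned = banned_next_tokens(history, ngram_size=3)
masked = apply_ban([0.1, 0.2, 0.3, 0.4], banned)
```

With the history `[1, 2, 3, 1, 2]` and trigrams, the current prefix is `(1, 2)`, which was previously followed by `3`, so token `3` is masked out before the next sampling step.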
    • [bnb] Fix bug in `_replace_with_bnb_linear` (#29958) · 33288ff1
      Marc Sun authored
      fix bug
    • Fix #29807: sinusoidal positional encodings in Flaubert, Informer and XLM (#29904) · 416711c3
      Hovnatan Karapetyan authored
      * Fix sinusoidal_embeddings in FlaubertModel
      
      * Fix for Informer
      
      * Fix for XLM
      
      * Move sinusoidal emb for XLM
      
      * Move sinusoidal emb for Flaubert
      
      * Small cleanup
      
      * Add comments on tests code copied from
      
      * Add with Distilbert->
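
As background for this fix, a sinusoidal position table in the standard Transformer formulation can be computed as below. This is a generic sketch of the technique in plain Python, not the code changed in the commit; `sinusoidal_table` is a hypothetical name, and the interleaved sin/cos layout is an assumption that may differ from what each of the affected models uses.

```python
import math
from typing import List


def sinusoidal_table(n_pos: int, dim: int) -> List[List[float]]:
    """Build an (n_pos x dim) table: sine on even columns, cosine on odd ones."""
    table = []
    for pos in range(n_pos):
        row = []
        for j in range(dim):
            # Columns 2k and 2k + 1 share the same frequency.
            angle = pos / (10000 ** (2 * (j // 2) / dim))
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


table = sinusoidal_table(n_pos=4, dim=6)
```

Such a table is fixed rather than learned, so it should be filled once with these values and kept out of the regular weight initialization.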
    • [`generate`] fix breaking change for patch (#29976) · 83b26dd7
      Arthur authored
      * fix bug and add tests
      
      * nit
      
      * other way to get the cur len instead of attention mask
      
      * more places where this might have been broken
      
      * nit
      
      * oops
      
      * inputs_embeds vs input_embeds
      
      * test generated outputs
      
      * style
      
      * nit
      
      * fix
      
      * skip failing biogpt
    • [docs] Big model loading (#29920) · 096f3046
      Steven Liu authored
      * update
      
      * feedback
  5. 01 Apr, 2024 1 commit