1. 13 Jun, 2022 2 commits
    • Add `LongT5` model (#16792) · a72f1c9f
      Daniel Stancl authored
      
      
      * Initial commit
      
      * Make some fixes
      
      * Make PT model full forward pass
      
      * Drop TF & Flax implementation, fix copies etc
      
      * Add Flax model and update some corresponding stuff
      
      * Drop some TF things
      
      * Update config and flax local attn
      
      * Add encoder_attention_type to config
      
      * .
      
      * Update docs
      
      * Do some cleansing
      
      * Fix some issues -> make style; add some docs
      
      * Fix position_bias + mask addition + Update tests
      
      * Fix repo consistency
      
      * Fix model consistency by removing flax operation over attn_mask
      
      * [WIP] Add PT TGlobal LongT5
      
      * .
      
      * [WIP] Add flax tglobal model
      
      * [WIP] Update flax model to use the right attention type in the encoder
      
      * Fix flax tglobal model forward pass
      
      * Make the use of global_relative_attention_bias
      
      * Add test suites for TGlobal model
      
      * Fix minor bugs, clean code
      
      * Fix pt-flax equivalence though not convinced with correctness
      
      * Fix LocalAttn implementation to match the original impl. + update READMEs
      
      * Few updates
      
      * Update: [Flax] improve large model init and loading #16148
      
      * Add ckpt conversion script according to #16853 + handle torch device placement
      
      * Minor updates to conversion script.
      
      * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
      
      * gpu support + dtype fix
      
      * Apply some suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Remove (de)parallelize stuff
      * Edit shape comments
      * Update README.md
      * make fix-copies
      
      * Remove caching logic for local & tglobal attention
      
      * Apply another batch of suggestions from code review
      
      * Add missing checkpoints
      * Format converting scripts
      * Drop (de)parallelize links from longT5 mdx
      
      * Fix converting script + revert config file change
      
      * Revert "Remove caching logic for local & tglobal attention"
      
      This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.
      
      * Stash caching logic in Flax model
      
      * Make side relative bias used always
      
      * Drop caching logic in PT model
      
      * Return side bias as it was
      
      * Drop all remaining model parallel logic
      
      * Remove clamp statements
      
      * Move test files to the proper place
      
      * Update docs with new version of hf-doc-builder
      
      * Fix test imports
      
      * Make some minor improvements
      
      * Add missing checkpoints to docs
      * Make TGlobal model compatible with torch.onnx.export
      * Replace some np.ndarray with jnp.ndarray
      
      * Fix TGlobal for ONNX conversion + update docs
      
      * fix _make_global_fixed_block_ids and masked neg value
      
      * update flax model
      
      * style and quality
      
      * fix imports
      
      * remove load_tf_weights_in_longt5 from init and fix copies
      
      * add slow test for TGlobal model
      
      * typo fix
      
      * Drop obsolete is_parallelizable and one warning
      
      * Update __init__ files to fix repo-consistency
      
      * fix pipeline test
      
      * Fix some device placements
      
      * [wip]: Update tests -- need to generate summaries to update expected_summary
      
      * Fix quality
      
      * Update LongT5 model card
      
      * Update (slow) summarization tests
      
      * make style
      
      * rename checkpoints
      
      * finish
      
      * fix flax tests
      Co-authored-by: phungvanduy <pvduy23@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: patil-suraj <surajp815@gmail.com>
    • Add Visual Question Answering (VQA) pipeline (#17286) · 66336dc1
      Sijun He authored
      
      
      * wip
      
      * rebase
      
      * all tests pass
      
      * rebase
      
      * ready for PR
      
      * address comments
      
      * fix styles
      
      * add require_torch to pipeline test
      
      * remove remote image to improve CI consistency
      
      * address comments; fix tf/flax tests
      
      * address comments; fix tf/flax tests
      
      * fix tests; add alias
      
      * repo consistency tests
      
      * Update src/transformers/pipelines/visual_question_answering.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * address comments
      
      * Update src/transformers/pipelines/visual_question_answering.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * merge
      
      * Update src/transformers/models/auto/modeling_auto.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * merge
      Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  2. 09 Jun, 2022 2 commits
    • Running a pipeline of `float16`. (#17637) · c38f4e1f
      Nicolas Patry authored
      When we're preparing the tensors for CPU for postprocessing, we need
      to upgrade the `float16` to `float32` since CPUs don't have instructions
      for `[b]float16`.
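The upcast described above loses no information, because every `float16` value is exactly representable in `float32`. The following is a minimal standard-library sketch of that fact only; it does not touch the pipeline's tensors, and the helper names `round_to_float16` and `round_to_float32` are hypothetical, introduced here for illustration.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Values as they would look after being stored in half precision.
half_values = [round_to_float16(v) for v in (0.1, 1.5, -3.14159, 1000.2)]

# "Upgrading" each half-precision value to single precision changes nothing.
upcast_values = [round_to_float32(v) for v in half_values]
upcast_is_lossless = half_values == upcast_values
```

Rounding to half precision does change most inputs (for example `0.1`), but the subsequent step up to single precision is exact.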
    • Adding `top_k` argument to `text-classification` pipeline. (#17606) · 2351729f
      Nicolas Patry authored
      * Adding `top_k` and `sort` arguments to `text-classification` pipeline.
      
      - Deprecate `return_all_scores` as `top_k` is more uniform with other
        pipelines, and a superset of what `return_all_scores` can do.
        BC is maintained though.
        `return_all_scores=True` -> `top_k=None`
        `return_all_scores=False` -> `top_k=1`
      
      - Using `top_k` will imply sorting the results, but using no argument
        will keep the results unsorted for backward compatibility.
      
      * Remove `sort`.
      
      * Fixing the test.
      
      * Remove bad doc.
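The argument mapping described in this commit message can be sketched roughly as follows. This is an illustrative stand-alone function, not the pipeline's actual code: the name `select_scores`, the `_UNSET` sentinel, and the `(label, score)` tuple format are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical sentinel to tell "top_k not passed" apart from "top_k=None".
_UNSET = object()

Score = Tuple[str, float]


def select_scores(
    scores: List[Score],
    top_k=_UNSET,
    return_all_scores: Optional[bool] = None,
) -> List[Score]:
    """Pick which (label, score) pairs to return, in the spirit of the commit.

    `scores` is assumed to be in the model's label order.
    """
    if top_k is not _UNSET:
        # Using `top_k` implies sorting; `None` means "keep every label".
        ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
        return ranked if top_k is None else ranked[:top_k]
    if return_all_scores:
        # Deprecated path, equivalent to top_k=None but left unsorted
        # for backward compatibility.
        return list(scores)
    # return_all_scores=False (or no argument) behaves like top_k=1.
    return [max(scores, key=lambda pair: pair[1])]


example = [("negative", 0.1), ("positive", 0.7), ("neutral", 0.2)]
top_two = select_scores(example, top_k=2)
legacy_all = select_scores(example, return_all_scores=True)
```

Here `top_two` is ordered by descending score, while `legacy_all` preserves the original label order.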
  3. 19 May, 2022 2 commits
  4. 18 May, 2022 1 commit
  5. 12 May, 2022 1 commit
  6. 10 May, 2022 1 commit
  7. 05 May, 2022 1 commit
  8. 21 Apr, 2022 1 commit
  9. 20 Apr, 2022 1 commit
  10. 14 Apr, 2022 1 commit
  11. 12 Apr, 2022 1 commit
    • Change the chunk_iter function to handle (#16730) · a192f61e
      Nicolas Patry authored
      * Change the chunk_iter function to handle
      
      the subtle cases where the last chunk gets ignored since all the
      data is in the `left_strided` data.
      
      We need to remove the right striding on the previous item.
      
      * Remove commented line.
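The edge case in this commit message can be illustrated on a plain list. The sketch below is not the library's `chunk_iter`: the names `iter_strided_chunks` and `reassemble` are hypothetical, and it assumes a simplified rule where the chunk that reaches the end of the data is treated as the last one. What it shows is the idea from the message: a following chunk would contain only left-stride data and be dropped, so the right stride must be removed from the final yielded chunk or the tail is lost.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_strided_chunks(
    data: Sequence[int], chunk_len: int, stride_left: int, stride_right: int
) -> Iterator[Tuple[Sequence[int], int, int]]:
    """Yield (chunk, left_stride, right_stride) over overlapping windows."""
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must exceed stride_left + stride_right")
    for start in range(0, len(data), step):
        chunk = data[start : start + chunk_len]
        # The first chunk has no left context to discard.
        left = 0 if start == 0 else stride_left
        # Assumed rule: a chunk that reaches the end of the data is the last.
        is_last = start + chunk_len >= len(data)
        # No chunk follows the last one, so its right stride is removed.
        right = 0 if is_last else stride_right
        yield chunk, left, right
        if is_last:
            return


def reassemble(
    data: Sequence[int], chunk_len: int, stride_left: int, stride_right: int
) -> List[int]:
    """Concatenate the non-strided part of every chunk."""
    kept: List[int] = []
    for chunk, left, right in iter_strided_chunks(
        data, chunk_len, stride_left, stride_right
    ):
        kept.extend(chunk[left : len(chunk) - right])
    return kept


samples = list(range(9))
recovered = reassemble(samples, chunk_len=6, stride_left=2, stride_right=2)
```

With these settings the third chunk covers items 4 to 8 and reaches the end of the data; because its right stride is dropped, items 7 and 8 are kept and `recovered` equals `samples`.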
  12. 18 Mar, 2022 1 commit
  13. 09 Mar, 2022 1 commit
  14. 04 Mar, 2022 2 commits
  15. 03 Mar, 2022 2 commits
  16. 02 Mar, 2022 1 commit
  17. 28 Feb, 2022 1 commit
  18. 25 Feb, 2022 2 commits
  19. 23 Feb, 2022 1 commit