1. 06 Aug, 2024 1 commit
    • Add codestral mamba2 (#32080) · 80b90e7b
      Pablo Montalvo authored
      * add new model like
      
      * draft cuda forward - mismatched keys (sharding on conv1)
      
      * match keys successfully
      
      * fix split
      
      * get generation/forward running (wrong gens, norm?)
      
      * update
      
      * some refactoring
      
      * fixes
      
      * works up until copy to cache
      
      * fix
      
      * update
      
      * NON WORKING VERSION
      
      * version that works?
      
      * nit
      
      * fix config
      
      * fix conversion script
      
      * working cuda forward
      
      * nit
      
      * update
      
      * simplification
      
      * make mamba slow simple work
      
      * no einops
      
      * todo
      
      * fix style
      
      * no einops
      
      * update fix no einsum
      
      * nit
      
      * remove einops
      
      * bug: scan_output differs strongly
      
      * add rms norm option
      
      * fix fast + slow generation with and w/o cache
      
      * draft integration tests
      
      * remove a big chunk of the einsum
      
      * fix slow, fast generations, without any einsum
      
      * fix copies
      
      * fix structure
      
      * fix up modeling and tests
      
      * fix tests
      
      * clamping is indeed worse
      
      * recover mamba2 cache test
      
      * fix copies
      
      * no cache position (yet)
      
      * fix tf tests
      
      * fix matmul for generate
      
      * fixup
      
      * skip cache tests for now
      
      * [run-slow]mamba2
      
      * tune out hidden states for padding
      
      * test batched generation
      
      * propagate attention mask changes
      
      * fix past length
      
      * fix integration test
      
      * style
      
      * address comments
      
      * update readme
      
      * add mamba2 version check
      
      * fix tests
      
      * [run-slow]mamba2
      
      * skip edge tests
      
      * [run-slow]mamba2
      
      * last fixup
      
      * [run-slow]mamba2
      
      * update README
      
      ---------
      Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
  2. 21 May, 2024 1 commit
  3. 11 Dec, 2020 1 commit
  4. 20 Jun, 2020 1 commit
    • Add BERT Loses Patience (Patience-based Early Exit) (#5078) · 2fd28d43
      Kevin Canwen Xu authored
      * Add BERT Loses Patience (Patience-based Early Exit)
      
      * update model archive
      
      * update format
      
      * sort import
      
      * flake8
      
      * Add results
      
      * full results
      
      * align the table
      
      * refactor to inherit
      
      * default per gpu eval = 1
      
      * Formatting
      
      * Formatting
      
      * isort
      
      * modify readme
      
      * Add check
      
      * Fix format
      
      * Fix format
      
      * Doc strings
      
      * ALBERT & BERT for sequence classification don't inherit from the original anymore
      
      * Remove incorrect comments
      
      * Remove incorrect comments
      
      * Remove incorrect comments
      
      * Sync up with new code
      
      * Sync up with new code
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Finishing up!
  5. 03 Mar, 2020 1 commit
    • Summarization Examples: add Bart CNN Evaluation (#3082) · 5b396457
      Sam Shleifer authored
      * Rename and improve example
      
      * Add test
      
      * slightly faster test
      
      * style
      
      * This breaks remy prolly
      
      * shorter test string
      
      * no slow
      
      * newdir structure
      
      * New tree
      
      * Style
      
      * shorter
      
      * docs
      
      * clean
      
      * Attempt future import
      
      * more import hax
  6. 06 Jan, 2020 2 commits
  7. 22 Dec, 2019 1 commit
  8. 26 Sep, 2019 1 commit
  9. 05 Jul, 2019 1 commit
  10. 02 Jul, 2019 1 commit