1. 09 May, 2023 1 commit
    • Sylvain Gugger's avatar
      Add RWKV-4 (#22797) · b4d4d6fe
      Sylvain Gugger authored
      
      
      * First draft of RWKV-4
      
      * Add support for generate
      
      * Style post-rebase
      
      * Properly use state
      
      * Write doc
      
      * Fix doc
      
      * More math
      
      * Add model to README, dummies and clean config
      
      * Fix init
      
      * multiple fixes:
      
      - fix common tests
      - fix configuraion default values
      - add CI test for checking state computation
      - fix some CI tests
      
      * correct tokenizer
      
      * some tweaks
      
      - fix config docstring
      - fix failing tests
      
      * fix CI tests
      
      - add output_attention / output_hidden_states
      - override test_initialization
      - fix failing CIs
      
      * fix conversion script
      
      - fix sharded case
      - add new arguments
      
      * add slow tests + more fixes on conversion script
      
      * add another test
      
      * final fixes
      
      * change single name variable
      
      * add mock attention mask for pipeline to work
      
      * correct eos token id
      
      * fix nits
      
      * add checkpoints
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add `tie_word_embeddings` in docstring
      
      * change tensor name
      
      * fix final nits
      
      * Trigger CI
      
      ---------
      Co-authored-by: default avataryounesbelkada <younesbelkada@gmail.com>
      Co-authored-by: default avatarYounes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      b4d4d6fe
  2. 11 Dec, 2020 1 commit
  3. 20 Jun, 2020 1 commit
    • Kevin Canwen Xu's avatar
      Add BERT Loses Patience (Patience-based Early Exit) (#5078) · 2fd28d43
      Kevin Canwen Xu authored
      * Add BERT Loses Patience (Patience-based Early Exit)
      
      * update model archive
      
      * update format
      
      * sort import
      
      * flake8
      
      * Add results
      
      * full results
      
      * align the table
      
      * refactor to inherit
      
      * default per gpu eval = 1
      
      * Formatting
      
      * Formatting
      
      * isort
      
      * modify readme
      
      * Add check
      
      * Fix format
      
      * Fix format
      
      * Doc strings
      
      * ALBERT & BERT for sequence classification don't inherit from the original anymore
      
      * Remove incorrect comments
      
      * Remove incorrect comments
      
      * Remove incorrect comments
      
      * Sync up with new code
      
      * Sync up with new code
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Add a test
      
      * Finishing up!
      2fd28d43
  4. 03 Mar, 2020 1 commit
    • Sam Shleifer's avatar
      Summarization Examples: add Bart CNN Evaluation (#3082) · 5b396457
      Sam Shleifer authored
      * Rename and improve example
      
      * Add test
      
      * slightly faster test
      
      * style
      
      * This breaks remy prolly
      
      * shorter test string
      
      * no slow
      
      * newdir structure
      
      * New tree
      
      * Style
      
      * shorter
      
      * docs
      
      * clean
      
      * Attempt future import
      
      * more import hax
      5b396457
  5. 06 Jan, 2020 2 commits
  6. 22 Dec, 2019 1 commit
  7. 26 Sep, 2019 1 commit
  8. 05 Jul, 2019 1 commit
  9. 02 Jul, 2019 1 commit