"tests/test_modeling_tf_distilbert.py" did not exist on "69bff89935a74cf429bd482543cf206c9c27be2f"
  • Add RWKV-4 (#22797) · b4d4d6fe
    Sylvain Gugger authored
    
    
    * First draft of RWKV-4
    
    * Add support for generate
    
    * Style post-rebase
    
    * Properly use state
    
    * Write doc
    
    * Fix doc
    
    * More math
    
    * Add model to README, dummies and clean config
    
    * Fix init
    
    * multiple fixes:
    
    - fix common tests
    - fix configuration default values
    - add CI test for checking state computation
    - fix some CI tests
    
    * correct tokenizer
    
    * some tweaks
    
    - fix config docstring
    - fix failing tests
    
    * fix CI tests
    
    - add output_attention / output_hidden_states
    - override test_initialization
    - fix failing CIs
    
    * fix conversion script
    
    - fix sharded case
    - add new arguments
    
    * add slow tests + more fixes on conversion script
    
    * add another test
    
    * final fixes
    
    * change single name variable
    
    * add mock attention mask for pipeline to work
    
    * correct eos token id
    
    * fix nits
    
    * add checkpoints
    
    * Apply suggestions from code review
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * add `tie_word_embeddings` in docstring
    
    * change tensor name
    
    * fix final nits
    
    * Trigger CI
    
    ---------
    Co-authored-by: younesbelkada <younesbelkada@gmail.com>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
README.md 87.1 KB