1. 10 Apr, 2023 1 commit
    • Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) · e0921c6b
      Joel Lamy-Poirier authored
      
      
      * Add model with cli tool
      
      * Remove unwanted stuff
      
      * Add new code
      
      * Remove inference runner
      
      * Style
      
      * Fix checks
      
      * Test updates
      
      * make fixup
      
      * fix docs
      
      * fix doc
      
      * fix test
      
      * hopefully fix pipeline tests
      
      * refactor
      
      * fix CIs
      
      * add comment
      
      * rename to `GPTBigCodeForCausalLM`
      
      * correct readme
      
      * make fixup + docs
      
      * make fixup
      
      * fixes
      
      * fixes
      
      * Remove pruning
      
      * Remove import
      
      * Doc updates
      
      * More pruning removal
      
      * Combine copies
      
      * Single MQA implementation, remove kv cache pre-allocation and padding
      
      * Update doc
      
      * Revert refactor to match gpt2 style
      
      * Merge back key and value caches, fix some type hints
      
      * Update doc
      
      * Fix position ids with padding (PR 21080)
      
      * Add conversion script temporarily
      
      * Update conversion script
      
      * Remove checkpoint conversion
      
      * New model
      
      * Fix MQA test
      
      * Fix copies
      
      * try fix tests
      
      * FIX TEST!!
      
      * remove `DoubleHeadsModel`
      
      * add MQA tests
      
      * add slow tests
      
      * clean up
      
      * add CPU checker
      
      * final fixes
      
      * fixes
      
      - fix GPU issue
      - fixed slow tests
      - skip disk offload
      
      * fix final issue
      
      * Simplify and comment baddbmm fix
      
      * Remove unnecessary code
      
      * Transpose tweaks
      
      * Use beta=1 on cpu, improve tests
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
  2. 07 Apr, 2023 11 commits
  3. 06 Apr, 2023 13 commits
  4. 05 Apr, 2023 13 commits
  5. 04 Apr, 2023 2 commits
    • Fix inverted conditional in TF common test! (#22540) · edb704b2
      Matt authored
      * Fix inverted conditional in TF common test!
      
      * Make the same change in the PT tests file
      
      * Make sure hidden states for GPT2 have the same output shape in PT/TF
      
      * Minor fix to PT implementation of token classification loss
      
      * Skip loss equivalence test for TFHubert because it keeps overflowing to inf
      
      * Compute LM loss for TF the (weird) way it's computed in PT
      
      * Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert
      
      * Fix - don't try to access the hidden states property when output is a tuple
    • Sourab Mangrulkar's avatar
      48fbd8fa