1. 23 May, 2024 1 commit
    • [Port] TensorFlow implementation of Mistral (#29708) · 965e98dc
      Aritra Roy Gosthipaty authored
      
      
      * chore: initial commit
      
      * chore: adding imports and inits
      
      * chore: adding the causal and classification code
      
      * chore: adding names to the layers
      
      * chore: using single self attn layer
      
      * chore: built the model and layers
      
      * chore: start with testing
      
      * chore: docstring change, transpose fix
      
      * fix: rotary embedding
      
      * chore: adding cache implementation
      
      * remove unused torch
      
      * chore: fixing the indexing issue
      
      * make fix-copies
      
      * Use modeling_tf_utils.keras
      
      * make fixup
      
      * chore: fixing tests
      
      * chore: adding past key value logic
      
      * chore: adding multi-label classification test
      
      * fix: switching on the built parameters in the layers
      
      * fixing repo consistency
      
      * ruff formats
      
      * style changes
      
      * fix: tf and pt equivalence
      
      * removing returns from docstrings
      
      * fix docstrings
      
      * fix docstrings
      
      * removing todos
      
      * fix copies
      
      * fix docstring
      
      * fix docstring
      
      * chore: using easier rotate_half
      
      * adding integration tests
      
      * chore: addressing review related to rotary embedding layer
      
      * review changes
      
      * [run-slow] mistral
      
      * skip: test save load after resize token embedding
      
      * style
      
      ---------
      Co-authored-by: Matt <rocketknight1@gmail.com>