    FlaxGPTJ (#14396) · 4c0dd199
    Suraj Patil authored
    * add flax gptj
    
    * no bias in attention dense
    
    * no wpe
    
    * fix rotary embeddings
    
    * fix rotary embeds
    
    * fix rotary embeds
    
    * quality
    
    * doc and quality
    
    * fix equivalence tests
test_modeling_flax_gptj.py 14.2 KB