    [`Llama ROPE`] Fix torch export but also slow downs in forward (#29198) · 8a8a0a4a
    Arthur authored
    * remove control flow
    
    * update gptneox
    
    * update ....
    
    * nits
    
    * Actually let's just break. Otherwise we are silently failing which imo is not optimal
    
    * version BC
    
    * fix tests
    
    * fix eager causal
    
    * nit
    
    * add a test
    
    * style
    
    * nits
    
    * nits
    
    * more nits for the test
    
    * update and fix
    
    * make sure cuda graphs are not skipped
    
    * read token is needed for meta llama
    
    * update!
    
    * fixup
    
    * compile test should be slow
    
    * fix the fix copies
    
    * style 🫠
test_modeling_llama.py 33.5 KB