    Generate: end-to-end compilation (#30788) · 7ffe25f2
    Joao Gante authored
    * mvp
    
    * added test (a few models need fixes)
    
    * fix a few test cases
    
    * test nits
    
    * harder test 😈
    
    * revert changes in stablelm
    
    * test with improved condition
    
    * add todo
    
    * tmp commit
    
    * merged with main
    
    * nits
    
    * add todo
    
    * final corrections
    
    * add docs for generation compilation
    
    * docs nits
    
    * add tip
    
    * PR suggestions
    
    * add more details to the compilation docs
    
    * fix cache positions
    
    * cache is now init in generate; update docs
    
    * tag test as flaky
    
    * docs
    
    * post rebase make fixup and other nits
    
    * remove unintended changes
    
    * whisper (encoder-decoder) not supported
    
    * move token default updates to ; add tests for token defaults
    
    * push changes
    
    * manual rebase
    
    * chameleon doesn't support this
    
    * fix test_static_cache_mha_mqa_gqa (broken in another PR)
    
    * docs: dynamic is better with end-to-end compilation
test_modeling_chameleon.py 18.6 KB