    Support codellama (#359) · 65c662f9
    Lyu Han authored
    * tmp
    
    * add demo for codellama inference
    
    * update
    
    * update
    
    * update
    
    * update codellama.md
    
    * export rope_theta
    
    * update
    
    * update doc
    
    * fix client.py
    
    * define SamplingParam
    
    * rollback 'end'
    
    * rotary_emb_base to rotary_embedding_base
    
    * change to baichuan2-7b
unfused_attention_kernels.h 6.73 KB