    [Feature] Stats Quantization Parameters for KV Cache (#45) · 3fff964d
    pppppM authored
    * add cal qparams
    
    * support offload inference
    
    * add collect functions (mod, weight)
    
    * stats kv scales
    
    * update init
    
    * add user guide
    
    * fix hints
    
    * fix comments & support turbomind format
    
    * update user guide
    
    * fix slice kv cache error & support pileval dataset (used in llm-awq)
    
    * fix wrong num heads slice
    
    * update default dataset
    
    * fix conflict
    
    * fix hints
    
    * fix hints
    
    * add gitignore
profile_generation.py 4.84 KB