1. 06 Jul, 2023 1 commit
  2. 05 Jul, 2023 2 commits
    • fix(kv_qparams.py): zp use min (#59) · ec53d63f
      tpoisonooo authored
      * fix(kv_qparams.py): zp use min
      
      * revert(qparams.py): revert format
      
      * fix(kv_qparams.py): update formula
    • [Feature] Stats Quantization Parameters for KV Cache (#45) · 3fff964d
      pppppM authored
      * add cal qparams
      
      * support offload inference
      
      * add collect functions (mod, weight)
      
      * stats kv scales
      
      * update init
      
      * add user guide
      
      * fix hints
      
      * fix comments & support turbomind format
      
      * update user guide
      
      * fix slice kv cache error & support pileval dataset (used in llm-awq)
      
      * fix wrong num heads slice
      
      * update default dataset
      
      * fix conflict
      
      * fix hints
      
      * fix hints
      
      * add gitignore