1. 17 Aug, 2023 1 commit
  2. 14 Aug, 2023 1 commit
  3. 05 Jul, 2023 4 commits
    • fix(kv_qparams.py): zp use min (#59) · ec53d63f
      tpoisonooo authored
      * fix(kv_qparams.py): zp use min
      
      * revert(qparams.py): revert format
      
      * fix(kv_qparams.py): update formula
    • docs(README): typo (#56) · 7396d8f6
      tpoisonooo authored
    • [Feature] Stats Quantization Parameters for KV Cache (#45) · 3fff964d
      pppppM authored
      * add cal qparams
      
      * support offload inference
      
      * add collect functions (mod,weight)
      
      * stats kv scales
      
      * update init
      
      * add user guide
      
      * fix hints
      
      * fix comments & support turbomind format
      
      * update user guide
      
      * fix slice kv cache error & support pileval dataset (used in llm-awq)
      
      * fix wrong num heads slice
      
      * update default dataset
      
      * fix conflict
      
      * fix hints
      
      * fix hints
      
      * add gitignore
    • docs(quantization): add more test (#53) · edb6eb86
      tpoisonooo authored
      * docs(quantization): add more test
      
      * revert(generate.sh): revert ninja
      
      * revert(llama_config.ini): revert empty line
      
      * fix(quantization.md): fix link error
  4. 04 Jul, 2023 2 commits