1. 25 Oct, 2023 1 commit
    • Add more user-friendly CLI (#541) · 169d5169
      RunningLeon authored
      * add
      
      * import fire in main
      
      * wrap to speed up fire cli
      
      * update
      
      * update docs
      
      * update docs
      
      * fix
      
      * resolve comments
      
      * resolve conflict and add test for cli
  2. 29 Aug, 2023 1 commit
  3. 24 Aug, 2023 1 commit
  4. 14 Aug, 2023 1 commit
  5. 11 Aug, 2023 1 commit
    • [Feature] Support AWQ (#108) · d3dbe179
      pppppM authored
      * support kv cache offload
      
      * add dataloader docstring
      
      * complete gitignore
      
      * refactor collect mod fn
      
      * add calibration
      
      * fix lint
      
      * add observers and quantizers
      
      * fix lints
      
      * add global available mixin
      
      * fix lints
      
      * split batch inference
      
      * support smoothquant and awq
      
      * update export kv scales
      
      * fix lints
      
      * fix some bugs
      
      * update weight only usage
      
      * update usage
      
      * auto mapping and support smooth internlm
      
      * trust remote code
      
      * fix num head key error
      
      * fix bias error
      
      * align shape and pack order with llm-awq
      
      * modified according to LZHgrla's comments.
      
      * update gitignore
      
      * fix kv qparams export error
      
      * update usage
      
      * decouple calibrate and awq
      
      * update docstrings
      
      * update api name
      
      * update readme
      
      * update readme
      
      * update readme
      
      * update readme
      
      * update kv_qparams and readme
      
      * fix typos
  6. 07 Aug, 2023 1 commit
  7. 20 Jul, 2023 1 commit
  8. 06 Jul, 2023 1 commit
  9. 05 Jul, 2023 2 commits
    • fix(kv_qparams.py): zp use min (#59) · ec53d63f
      tpoisonooo authored
      * fix(kv_qparams.py): zp use min
      
      * revert(qparams.py): revert format
      
      * fix(kv_qparams.py): update formula
    • [Feature] Stats Quantization Parameters for KV Cache (#45) · 3fff964d
      pppppM authored
      * add cal qparams
      
      * support offload inference
      
      * add collect functions (mod, weight)
      
      * stats kv scales
      
      * update init
      
      * add user guide
      
      * fix hints
      
      * fix comments & support turbomind format
      
      * update user guide
      
      * fix slice kv cache error & support pileval dataset (used in llm-awq)
      
      * fix wrong num heads slice
      
      * update default dataset
      
      * fix conflict
      
      * fix hints
      
      * fix hints
      
      * add gitignore