- 13 Nov, 2023 1 commit
pppppM authored
* update supported matrix
* change the default shard size when saving quantized weights
* baichuan2 kv8
- 03 Nov, 2023 1 commit
pppppM authored
* fix awq
* adapt new qwen code
* adapt qwen 14b and baichuan2 7b
* add docstring
* add runtime error for qwen
- 25 Oct, 2023 1 commit
RunningLeon authored
* add
* import fire in main
* wrap to speed up fire cli
* update
* update docs
* update docs
* fix
* resolve comments
* resolve conflict and add test for cli
- 24 Aug, 2023 1 commit
pppppM authored
* fix llama2 70b
* fix qwen quantization
* remove pdb
* add faq
- 11 Aug, 2023 1 commit
pppppM authored
* support kv cache offload
* add dataloader docstring
* complete gitignore
* refactor collect mod fn
* add calibration
* fix lint
* add observers and quantizers
* fix lints
* add global available mixin
* fix lints
* split batch inference
* support smoothquant and awq
* update export kv scales
* fix lints
* fix some bugs
* update weight only usage
* update usage
* auto mapping and support smooth internlm
* trust remote code
* fix num head key error
* fix bias error
* align shape and pack order with llm-awq
* modified according to LZHgrla's comments.
* update gitignore
* fix kv qparams export error
* update usage
* decouple calibrate and awq
* update docstrings
* update api name
* update readme
* update readme
* update readme
* update readme
* update kv_qparams and readme
* fix typos