- 25 Oct, 2023 1 commit
RunningLeon authored
* add * import fire in main * wrap to speed up fire cli * update * update docs * update docs * fix * resolve comments * resolve conflict and add test for cli
-
- 14 Aug, 2023 2 commits
Lyu Han authored
* tmp * update * update * update * update * update * remove * update * update
-
Li Zhang authored
* add w4a16 * fix `deploy.py` * add doc * add w4a16 kernels * fuse w1/w3 & bugfixes * fix typo * python * guard sm75/80 features * add missing header * refactor * qkvo bias * update cost model * fix lint * update `deploy.py`
-
- 23 Jul, 2023 1 commit
lvhan028 authored
* refactor model.py and support baichuan-7b * remove model_name * remove hard session_len * export tokenizer.py to target dir * remove model_name from client * remove model_name * update * correct throughput equation * fix session.response * update serving.md * update readme * update according to review comments * update * update * update * update
-
- 17 Jul, 2023 1 commit
Jaylin Lee authored
* [bugfix] Fix some docs' bug in 'serving'
-
- 13 Jul, 2023 1 commit
del-zhenwu authored
-
- 11 Jul, 2023 2 commits
tpoisonooo authored
* docs(serving.md): typo * docs(README): quantization
-
q.yao authored
* update contrib * update links
-
- 05 Jul, 2023 1 commit
lvhan028 authored
* add performance * use png * update * update * update * update * update