Commits · f44ef17c7b02db0fde6b0877ac850584168338c8 · OpenDAS / Lmdeploy

21 Aug, 2023 3 commits
- docs(quantization): update description (#272) · f44ef17c
  tpoisonooo authored Aug 21, 2023
  
- add readthedocs (#208) · c238f1cd
  RunningLeon authored Aug 21, 2023
```
* add readthedocs configs

* update readme

* fix link

* update

* remove turbomind in api

* update

* fix comment and remove api
```
- Check-in FAQ (#256) · 2f29a3c7
  Lyu Han authored Aug 21, 2023
```
* Check-in FAQ

* update

* update
```
17 Aug, 2023 1 commit

docs(quantization): update description (#253) · 903707b5

tpoisonooo authored Aug 17, 2023

* Update quantization.md

* docs(quantization): update description

* docs(README): rename quantization files


15 Aug, 2023 1 commit
- Remove specified version in user guide (#241) · e68a1d00
  Lyu Han authored Aug 15, 2023
  
14 Aug, 2023 2 commits
- Check-in user guide for w4a16 LLM deployment (#224) · 8e8629de
  Lyu Han authored Aug 14, 2023
```
* tmp

* update

* update

* update

* update

* update

* remove

* update

* update
```
- feat(quantization): kv cache use asymmetric (#218) · 902a3e16
  tpoisonooo authored Aug 14, 2023
```
* feat(quantization): kv cache use asymmetric
```
04 Aug, 2023 1 commit

Support serving with gradio without communicating to TIS (#162) · 18c386d9

AllentDan authored Aug 04, 2023



* use local model for webui

* local model for app.py

* lint

* remove print

* add seed

* comments

* fixed session_id

* support turbomind batch inference

* update app.py

* lint and docstring

* move webui to serve/gradio

* update doc

* update doc

* update docstring and remove print conversition

* log

* Update docs/zh_cn/build.md
Co-authored-by: Chen Xin <xinchen.tju@gmail.com>

* Update docs/en/build.md
Co-authored-by: Chen Xin <xinchen.tju@gmail.com>

* use latest gradio

* fix

* replace partial with InterFace

* use host ip instead of coolie

---------
Co-authored-by: Chen Xin <xinchen.tju@gmail.com>


03 Aug, 2023 1 commit
- [Docs] Translate turbomind.md into Chinese (#173) · 5545bbc5
  Xin Li authored Aug 03, 2023
```
* translate turbomind

* keep persistent batching

* revised

* revise
```
23 Jul, 2023 1 commit

Refactor the chat template of supported models using factory pattern (#144) · 7b470f07

lvhan028 authored Jul 23, 2023

* refactor model.py and support baichuan-7b

* remove model_name

* remove hard session_len

* export tokenizer.py to target dir

* remove model_name from client

* remove model_name

* update

* correct throughput equation

* fix session.response

* update serving.md

* update readme

* update according to review comments

* update

* update

* update

* update


17 Jul, 2023 1 commit
- [bugfix] Fix some docs' bug in 'serving' (#109) · 169d8c7f
  Jaylin Lee authored Jul 17, 2023
```
* [bugfix] Fix some docs' bug in 'serving'

* [bugfix] Fix some docs' bug in 'serving'
```
11 Jul, 2023 1 commit
- docs(serving.md): typo (#92) · 4db08045
  tpoisonooo authored Jul 11, 2023
```
* docs(serving.md): typo

* docs(README): quantization
```
05 Jul, 2023 5 commits

improve readme (#52) · 3e7b6bfd

lvhan028 authored Jul 05, 2023

* add performance

* use png

* update

* update

* update

* update

* update


fix(kv_qparams.py): zp use min (#59) · ec53d63f

tpoisonooo authored Jul 05, 2023

* fix(kv_qparams.py): zp use min

* revert(qparams.py): revert format

* fix(kv_qparams.py): update formula


docs(README): typo (#56) · 7396d8f6
tpoisonooo authored Jul 05, 2023


[Feature] Stats Quantization Parameters for KV Cache (#45) · 3fff964d

pppppM authored Jul 05, 2023

* add cal qparams

* support offload inference

* add collect functions (mod, weight)

* stats kv scales

* update init

* add user guide

* fix hints

* fix comments & support turbomind format

* update user guide

* fix slice kv cache error & support pileval dataset (used in llm-awq)

* fix wrong num heads slice

* update default dataset

* fix conflict

* fix hints

* fix hints

* add gitignore


docs(quantization): add more test (#53) · edb6eb86

tpoisonooo authored Jul 05, 2023

* docs(quantization): add more test

* revert(generate.sh): revert ninja

* revert(llama_config.ini): revert empty line

* fix(quantization.md): fix link error


04 Jul, 2023 2 commits

Update quantization.md (#47) · fa7cbc7a
tpoisonooo authored Jul 04, 2023


docs(project): add quantization test results (#46) · 197b3ee1

tpoisonooo authored Jul 04, 2023

* docs(README): update description

* docs(project): add quantization test results

* docs(README): reorder

* docs(quantization): add more description

* docs(README): remove openmmlab badge

* docs(README): scale up image

* docs(dir): add zh_cn subdir
