Commits · 5c9e1e285e50c7be6cbcec04c47b4f0b929ede85 · OpenDAS / Lmdeploy

29 Nov, 2023 1 commit

Report first-token-latency and token-latency percentiles (#736) · 5c9e1e28

Lyu Han authored Nov 29, 2023

* update profile scripts

* add top_p, top_k and temperature as input arguments

* fix input_ids

* update profile_throughput

* update profile_restful_api

* update profile_serving

* update

* update

* add progress bar

* remove TODO comments

* update

* remove useless profile_* argument

* remove log level

* change concurrency default value to 64

* update restful_api.md

* update according to review comments

* fix docstring


24 Oct, 2023 1 commit

Fix crash and remove `sys_instruct` from `chat.py` and `client.py` (#591) · ffe4ba9c

Chen Xin authored Oct 24, 2023

* fix crash

* update profile_generation.py

* format

* use self.bos_id

* remove sys_instruct

26 Sep, 2023 1 commit

fix benchmark serving cannot use Qwen tokenizer (#443) · 97dcdff7

AllentDan authored Sep 26, 2023

* fix benchmark serving cannot use Qwen tokenizer

* update benchmark readme

18 Sep, 2023 1 commit

Profile token generation with more settings (#364) · dfa67e8c

AllentDan authored Sep 18, 2023

* better profiler

* wait for releasing mem

* remove fire

* remove support for multiple model benchmark

* comments

* output more details

* correct tp


23 Jul, 2023 1 commit

Refactor the chat template of supported models using factory pattern (#144) · 7b470f07

lvhan028 authored Jul 23, 2023

* refactor model.py and support baichuan-7b

* remove model_name

* remove hard session_len

* export tokenizer.py to target dir

* remove model_name from client

* remove model_name

* update

* correct throughput equation

* fix session.response

* update serving.md

* update readme

* update according to review comments

* update

* update

* update

* update


22 Jul, 2023 1 commit

add profile throughput benchmark (#146) · 2067862d

q.yao authored Jul 22, 2023

* add profile throughput benchmark

* add output only throughput

* update req/min

* update benchmark readme

* fix lint

---------
Co-authored-by: grimoire <yaoqian@pjlab.org.cn>
