- 16 Oct, 2023 1 commit
q.yao authored
* move tokenizer
* remove Tokenizer in init
* update deploy.py
- 26 Sep, 2023 1 commit
AllentDan authored
* fix benchmark serving being unable to use the Qwen tokenizer
* update benchmark README
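A frequent reason a benchmarking script cannot load the Qwen tokenizer is that Qwen ships its tokenizer as custom code inside the model repository, which Transformers only executes when `trust_remote_code=True` is passed. Whether that is what this commit changed is an assumption; the sketch below only illustrates the general workaround, and the model id is illustrative.

```python
from transformers import AutoTokenizer

# Qwen's tokenizer lives in the model repo itself, so Transformers must be
# allowed to run that remote code when loading it. The model id below is
# illustrative, not taken from this commit.
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    trust_remote_code=True,
)

token_ids = tokenizer.encode("Hello, how are you?")
print(len(token_ids), "tokens")
```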
- 04 Sep, 2023 1 commit
Lyu Han authored
* read data after starting processes
* fix hang
* fix exceptions when request_output_len is 0
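Handling a `request_output_len` of 0 usually comes down to treating it as "generate nothing" and returning before the decode loop runs. The sketch below shows that kind of guard under assumed names; it is not the repo's actual generation code.

```python
import random

def sample_next_token(context_ids):
    # Stand-in for real model sampling: returns a random token id.
    return random.randrange(32000)

def generate(prompt_ids, request_output_len):
    # Assumed guard: a request for zero output tokens returns an empty
    # result instead of raising inside the decode loop.
    if request_output_len <= 0:
        return []
    output_ids = []
    for _ in range(request_output_len):
        output_ids.append(sample_next_token(prompt_ids + output_ids))
    return output_ids

print(generate([1, 2, 3], 0))        # [] rather than an exception
print(len(generate([1, 2, 3], 4)))   # 4
```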
- 22 Aug, 2023 1 commit
AllentDan authored
* add restful api
* refine
* add simple doc
* lint
* add uvicorn requirement
* more args
* add llama2
* docstring
* update doc
* save
* refine
* lint
* better decode
* add v1/embedding
* add GenerateRequest
* add llama2 chat template
* correct profiling
* update documents
* add length judge
* add faq
* update doc and rename req_que to req_queue
* fix md link, use get_logger, fix sequence_end bug
* use another doc link for go to avoid lint error
* add api_client.py (a client sketch follows this list)
* update doc
* update doc
* update function interface
* update FAQ
* resolve comments
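The RESTful server added here is plain HTTP plus JSON, which is what `api_client.py` wraps. The sketch below drives such a server with `requests`; the address, route, payload fields, and streaming format are assumptions for illustration, not the exact schema introduced in this commit.

```python
import json

import requests

# Illustrative address and route; the real ones come from the api_server
# and api_client added in this commit.
API_URL = "http://0.0.0.0:23333/generate"

payload = {
    "prompt": "Hello, how are you?",  # assumed field name
    "request_output_len": 128,        # assumed field name
    "sequence_start": True,           # assumed field name
    "sequence_end": True,             # assumed field name
}

response = requests.post(API_URL, json=payload, stream=True)
response.raise_for_status()

# Assume the server streams newline-delimited JSON chunks and print the
# text of each chunk as it arrives.
for line in response.iter_lines(decode_unicode=True):
    if line:
        chunk = json.loads(line)
        print(chunk.get("text", ""), end="", flush=True)
```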
- 07 Aug, 2023 1 commit
lvhan028 authored
* change to incremental decoding
* update
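Incremental decoding means emitting only the text produced since the last step instead of re-printing the whole response every iteration. The sketch below shows one common way to do that with a Hugging Face tokenizer by diffing against the text already emitted; it is a generic illustration, not this repository's implementation.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer

def stream_decode(token_ids):
    """Yield only the newly produced text after each generated token."""
    generated, emitted = [], ""
    for tid in token_ids:
        generated.append(tid)
        # Decode the ids generated so far and emit just the new suffix, so
        # tokens that only complete a word are not printed twice.
        text = tokenizer.decode(generated, skip_special_tokens=True)
        new_text = text[len(emitted):]
        if new_text:
            emitted = text
            yield new_text

ids = tokenizer.encode("Incremental decoding emits text piece by piece.")
for piece in stream_decode(ids):
    print(piece, end="", flush=True)
print()
```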
- 31 Jul, 2023 1 commit
del-zhenwu authored
- 23 Jul, 2023 1 commit
lvhan028 authored
* refactor model.py and support baichuan-7b
* remove model_name
* remove hard-coded session_len
* export tokenizer.py to target dir
* remove model_name from client
* remove model_name
* update
* correct throughput equation (see the sketch after this list)
* fix session.response
* update serving.md
* update README
* update according to review comments
* update
* update
* update
* update
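A serving benchmark's throughput figure is, at its core, tokens divided by wall-clock time; what a "correct throughput equation" change typically pins down is which tokens are counted and over which interval. The sketch below computes the generated-tokens-per-second form under assumed names; it is not the repo's profiling code.

```python
import time

def measure_throughput(run_batch, prompts):
    """Return tokens/s for one batch, counting generated tokens only.

    `run_batch` is a hypothetical callable that processes the prompts and
    returns the number of tokens generated for each one.
    """
    start = time.perf_counter()
    generated_counts = run_batch(prompts)
    elapsed = time.perf_counter() - start
    return sum(generated_counts) / elapsed

# Usage with a stand-in workload that pretends to generate 128 tokens each.
fake_run = lambda prompts: [128 for _ in prompts]
print(f"{measure_throughput(fake_run, ['a', 'b', 'c']):.1f} tokens/s")
```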
- 19 Jul, 2023 1 commit
rollroll90 authored
- 30 Jun, 2023 2 commits
- 25 Jun, 2023 1 commit
lvhan028 authored
* remove constraints on model name
* remove duplicate model converter
* add profile
* get eos and bos from server
* update stop_words
* update sequence_length when the last generated token is eos_id (see the sketch after this list)
* fix
* fix
* check-in models
* validate model_name
* make stop_words a property
* debug profiling
* better stats
* fix assistant response
* update profile serving
* update
* update
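Updating sequence_length when the last generated token is eos_id amounts to not counting the end-of-sequence token as visible output. The sketch below shows that kind of post-processing with assumed names and behaviour; it is not the repository's actual code.

```python
def finalize_output(output_ids, eos_id, gen_len):
    """Truncate at eos and report the effective generated length.

    Assumed behaviour: if the last generated token is eos_id, drop it from
    the visible output and reduce the reported sequence length by one.
    """
    if output_ids and output_ids[-1] == eos_id:
        return output_ids[:-1], gen_len - 1
    return output_ids, gen_len

ids, length = finalize_output([5, 17, 42, 2], eos_id=2, gen_len=4)
print(ids, length)  # [5, 17, 42] 3
```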