- 17 Oct, 2023 1 commit
- Lyu Han authored
- 16 Oct, 2023 2 commits
- 13 Oct, 2023 3 commits
- del-zhenwu authored
  * [doc] Update benchmark command in w4a16.md
  * Update w4a16.md
  * Update w4a16.md: add pip install nvidia-ml-py
  * [doc] Update w4a16.md
  * fix lint error
  * [doc] update model_path & prompt_tokens

  Signed-off-by: del-zhenwu <dele.zhenwu@gmail.com>
- Chen Xin authored
  * add tp hint for deploy
  * fix lint
  * assert tp in turbomind
  * fix lint
- YiiSh authored
- 12 Oct, 2023 2 commits
- 11 Oct, 2023 3 commits
- akhoroshev authored
- Shahrukh Khan authored
- AllentDan authored
  * make IPv6 compatible, safe run for coroutine interrupting
  * instance_id -> session_id and fix api_client.py
  * update doc
  * remove useless faq
  * safe ip mapping
  * update app.py
  * remove print
  * update doc
- 09 Oct, 2023 3 commits
- 26 Sep, 2023 7 commits
- Lyu Han authored
- Lyu Han authored
  * Fix memory leak
  * modern c++
- AllentDan authored
  * fix benchmark serving cannot use Qwen tokenizer
  * update benchmark readme
- aisensiy authored
- akhoroshev authored
- AllentDan authored
  * expose stop words
  * support string
  * fix
  * remove eoa from chatbot
  * remove eoa of turbomind
  * fix ut
  * suffix wheel and fix InternLM no system bug
- akhoroshev authored
  * cuda allocator fix
  * graceful termination
  * lint and compilation fix
- 25 Sep, 2023 3 commits
- Lyu Han authored
- Lyu Han authored
  Fix side effect brought by supporting codellama: `sequence_start` is always true when calling `model.get_prompt` (#466)
- Ikko Eltociear Ashimine authored
  quantilized -> quantized
- 20 Sep, 2023 2 commits
- Lyu Han authored
- Lyu Han authored
  * better profiler
  * wait for releasing mem
  * remove fire
  * remove support for multiple model benchmark
  * comments
  * support actual seqlen
  * change chat template
  * update
  * fix ut
  * int->size_t
  * output more details
  * correct tp
  * rollback
  * update
  * update readme
  * add 'internlm-chat' as the default tag for internlm chat models
  * rollback tokenizer

  Co-authored-by: AllentDan <AllentDan@yeah.net>
  Co-authored-by: grimoire <yaoqian@pjlab.org.cn>
- 19 Sep, 2023 1 commit
- RunningLeon authored
- 18 Sep, 2023 4 commits
- AllentDan authored
  * better profiler
  * wait for releasing mem
  * remove fire
  * remove support for multiple model benchmark
  * comments
  * output more details
  * correct tp
- q.yao authored
  * support actual seqlen
  * fix lint
  * update variable types
  * lint
  * update type
  * fix lint
- AllentDan authored
  * fix token count bug
  * fix error response
- Chen Xin authored
  * reduce gil switching
  * ffi lock func
  * remove unused
  * remove unused
  * remove unused
- 14 Sep, 2023 2 commits
- 13 Sep, 2023 2 commits
- 12 Sep, 2023 1 commit
- RunningLeon authored
  This reverts commit 7368b88692ecca3f5b39f92a8cc41cf21e3fd71e.
- 11 Sep, 2023 3 commits
- Lyu Han authored
- liukuikun authored
- Lyu Han authored
  * tmp
  * add demo for codellama inference
  * update
  * update
  * update
  * update codellama.md
  * export rope_theta
  * update
  * update doc
  * fix client.py
  * define SamplingParam
  * rollback 'end'
  * rotary_emb_base to rotary_embedding_base
  * change to baichuan2-7b
- 08 Sep, 2023 1 commit
- WRH authored
  * support baichuan2-chat
  * update args from generation config
  * update deploy.py
  * update readme
  * tested with tp
  * step-1 when last id is eos
  * add news

  Co-authored-by: chenxin <chenxin@pjlab.org.cn>