Commits · a77d2d042f1e1ef3f20b0e4e964d0a4cb37889d3 · guobj / Qwen_lmdeploy

13 Jul, 2024 1 commit
- add scnet and icon · a77d2d04
  dongchy920 authored Jul 13, 2024
23 Feb, 2024 3 commits
- Update README.md · 551c7b25
  zhouxiang authored Feb 23, 2024
- Update README.md · 9a426bac
  zhouxiang authored Feb 23, 2024
- Update model download address to · 51ac7c4f
  zhouxiang authored Feb 23, 2024
31 Jan, 2024 4 commits
- Improve readme · 0080bc5a
  zhouxiang authored Jan 31, 2024
- Improve readme · 7ccc04b8
  zhouxiang authored Jan 31, 2024
- Improve readme · eafad7d3
  zhouxiang authored Jan 31, 2024
- Update lmdeploy version to 0.1.0 · 6dbf6277
  zhouxiang authored Jan 31, 2024
24 Jan, 2024 1 commit
- Revise the description of the installation steps · 32e069e1
  zhouxiang authored Jan 24, 2024
11 Jan, 2024 1 commit
- Improve readme · 4efcbb70
  zhouxiang authored Jan 11, 2024
22 Dec, 2023 2 commits
- Improve readme · 789b5c20
  zhouxiang authored Dec 22, 2023
- Add qwen72b support · 6afe24b1
  zhouxiang authored Dec 22, 2023
30 Nov, 2023 2 commits
- Update README.md · f3f9a9a3
  xiabo authored Nov 30, 2023
- Add data files · 5c2f7531
  xiabo authored Nov 30, 2023
23 Nov, 2023 2 commits
- Update README.md · d592fbea
  xiabo authored Nov 23, 2023
- Update README.md · f5657d46
  xiabo authored Nov 23, 2023
22 Nov, 2023 5 commits
- Update README.md · ba02a3e3
  xiabo authored Nov 22, 2023
- Update README.md · 059ba02f
  xiabo authored Nov 22, 2023
- Update README.md · 243ff4a4
  xiabo authored Nov 22, 2023
- Update README.md · b2947549
  xiabo authored Nov 22, 2023
- Update README.md · 26265a69
  xiabo authored Nov 22, 2023
16 Nov, 2023 1 commit
- Update README.md · 6939e47c
  xiabo authored Nov 16, 2023
15 Nov, 2023 1 commit
- Adapt to rocm; flashattention2 is not applicable · bc3c64aa
  xiabo authored Nov 15, 2023
25 Oct, 2023 1 commit

Add more user-friendly CLI (#541) · 169d5169

RunningLeon authored Oct 25, 2023

* add

* import fire in main

* wrap to speed up fire cli

* update

* update docs

* update docs

* fix

* resolve comments

* resolve conflict and add test for cli

19 Oct, 2023 1 commit
- add solar chat template (#576) · 70a5c63a
  AllentDan authored Oct 19, 2023
12 Oct, 2023 2 commits
- support deploy qwen-14b-chat (#482) · b21239a8
  Chen Xin authored Oct 12, 2023
```
* support deploy qwen-14b-chat

* update README

* load safetensors first
```
- update huggingface internlm-chat-7b model url (#546) · 27e12477
  AllentDan authored Oct 12, 2023
25 Sep, 2023 1 commit
- Fix typo in README.md (#462) · 71945001
  Ikko Eltociear Ashimine authored Sep 25, 2023
```
quantilized -> quantized
```
20 Sep, 2023 1 commit

Support InternLM 20B (#440) · df7955de

Lyu Han authored Sep 20, 2023



* better profiler

* wait for releasing mem

* remove fire

* remove support for multiple model benchmark

* comments

* support actual seqlen

* change chat template

* update

* fix ut

* int->size_t

* output more details

* correct tp

* rollback

* update

* update readme

* add 'internlm-chat' as the default tag for internlm chat models

* rollback tokenizer

---------
Co-authored-by: AllentDan <AllentDan@yeah.net>
Co-authored-by: grimoire <yaoqian@pjlab.org.cn>

11 Sep, 2023 1 commit

Support codellama (#359) · 65c662f9

Lyu Han authored Sep 11, 2023

* tmp

* add demo for codellama inference

* update

* update

* update

* update codellama.md

* export rope_theta

* update

* update doc

* fix client.py

* define SamplingParam

* rollback 'end'

* rotary_emb_base to rotary_embedding_base

* change to baichuan2-7b

08 Sep, 2023 1 commit

Support baichuan2-chat chat template (#378) · 55764e0b

WRH authored Sep 08, 2023



* support baichuan2-chat

* update args from generation config

* update deploy.py

* update readme

* tested with tp

* step-1 when last id is eos

* add news

---------
Co-authored-by: chenxin <chenxin@pjlab.org.cn>

06 Sep, 2023 1 commit
- Update logo (#372) · e4701226
  Lyu Han authored Sep 06, 2023
05 Sep, 2023 1 commit
- [Doc] Fix quantization docs link (#367) · 683c3fe9
  Zhihao Lin authored Sep 05, 2023
29 Aug, 2023 2 commits

Add flashattention2 (#196) · 452822a4

q.yao authored Aug 29, 2023



* first

* fix causal mask

* disable flash attention2 on sm70

* fix 2

* update readme

* clang-format

* disable ft2 on windows

* fix lint

* fix build

* fix build

* fix long kv seq

* fix lint

* sync copy output

---------
Co-authored-by: grimoire <yaoqian@pjlab.org.cn>
Co-authored-by: irexyc <irexyc@gmail.com>

fix(kvint8): update doc (#315) · a48e2d27

tpoisonooo authored Aug 29, 2023



* fix(kvint8): update doc

* style(lmdeploy): format

* style(kv_qparams.py): linting

* fix lint

* Update kv_int8.md

* Update kv_int8.md

---------
Co-authored-by: AllentDan <AllentDan@yeah.net>

24 Aug, 2023 2 commits

Enable the Gradio server to call inference services through the RESTful API (#287) · 4279d8ca

AllentDan authored Aug 24, 2023



* app use async engine

* add stop logic

* app update cancel

* app support restful-api

* update doc and use the right model name

* set doc url root

* add comments

* add an example

* renew_session

* update readme.md

* resolve comments

* Update restful_api.md

* Update restful_api.md

* Update restful_api.md

---------
Co-authored-by: tpoisonooo <khj.application@aliyun.com>

[Fix] Fix llama2 70b & qwen quantization error (#273) · d5cb0be2
pppppM authored Aug 24, 2023
```
* fix llama2 70b

* fix qwen quantization

* remove pdb

* add faq
```
21 Aug, 2023 2 commits
- docs(quantization): update description (#272) · f44ef17c
  tpoisonooo authored Aug 21, 2023
- add readthedocs (#208) · c238f1cd
  RunningLeon authored Aug 21, 2023
```
* add readthedocs configs

* update readme

* fix link

* update

* remove turbomind in api

* update

* fix comment and remove api
```
18 Aug, 2023 1 commit

[Feature] Support Qwen-7B, dynamic NTK scaling and logN scaling in turbomind (#230) · 4a60b45d

Li Zhang authored Aug 18, 2023

* qwen support

* dynamic ntk & logn attn

* fix ntk & add chat template

* fix ntk scaling & stop words

* fix lint

* add tiktoken to requirements.txt

* fix tokenizer, set model format automatically

* update model.py

* update readme

* fix lint
