Commit 75cf0593 authored by chenzhuo

readme

parent 374c78ca
@@ -11,8 +11,9 @@ LMDeploy is jointly developed by [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
- **persistent batch inference**: further optimizes model execution efficiency.<br>
Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github.com/InternLM/lmdeploy)<br>
Qwen1.5 is now supported; see sugon_readme.md for details.
## Official features not yet supported
- **Quantized inference**: only fp16 inference is currently supported; AWQ int4 weight quantization and kv-cache int8 inference are not yet available (see the fp16 usage sketch below).
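
A minimal usage sketch for the fp16-only setup described above, assuming this fork keeps the upstream lmdeploy `pipeline` API unchanged; the checkpoint path is a placeholder, not a path from this repository:

```python
# Minimal sketch, assuming the upstream lmdeploy `pipeline` API is
# unchanged in this fork. The checkpoint path below is a placeholder.
from lmdeploy import pipeline

# TurboMind loads the weights in fp16 (awq-int4 and kv-cache int8 are
# not supported here) and serves requests with persistent batching.
pipe = pipeline('/path/to/Qwen1.5-7B-Chat')
responses = pipe(['Hello, please introduce yourself.'])
print(responses[0].text)
```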
@@ -28,6 +29,10 @@ Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github
| QWen-7B | Yes | Yes |
| QWen-14B | Yes | Yes |
| QWen-72B | Yes | Yes |
| QWen1.5-7B | Yes | Yes |
| QWen1.5-14B | Yes | Yes |
| QWen1.5-72B | Yes | Yes |
| QWen1.5-110B | Yes | Yes |
| Baichuan-7B | Yes | Yes |
| Baichuan2-7B | Yes | Yes |
| WizardLM | Yes | Yes |
......
### Modified parts (qwen1.5)
1. requirements/runtime.txt: pin transformers==4.38.2<br>
2. lmdeploy/turbomind/deploy/source_model/qwen.py<br>
Replace the contents of this file with the following, adding the weight-loading mapping for Qwen models:<br>
```python
...
```
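
The actual replacement contents of qwen.py are collapsed in this diff view. Purely as a hedged sketch of the shape such a change typically takes, the following shows a source-model reader registration; the registry key `qwen2`, the class name, and the `Reader`/`model_info` members are assumptions modeled on the other readers under lmdeploy/turbomind/deploy/source_model/, not this fork's real code:

```python
# Hypothetical sketch only -- the real modified qwen.py is collapsed above.
# Qwen1.5 checkpoints use LLaMA-style tensor names
# (model.layers.N.self_attn.q_proj, ...), so the LLaMA reader can largely
# be reused; the qkv projections additionally carry biases.
from .base import INPUT_MODELS
from .llama import LlamaModel, LlamaReader


@INPUT_MODELS.register_module(name='qwen2')  # registry key is an assumption
class Qwen2Model(LlamaModel):
    """Qwen1.5 (Qwen2-architecture) model in hf format."""

    Reader = LlamaReader  # LLaMA-style weight names map directly

    def model_info(self):
        info = super().model_info()
        info['attn_bias'] = 1  # Qwen1.5 attention projections carry biases
        return info
```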