Commit 75cf0593 authored by chenzhuo

readme

parent 374c78ca
Pipeline #1013 canceled with stages
@@ -11,8 +11,9 @@ LMDeploy is jointly developed by the [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
- **persistent batch inference**: further improves model execution efficiency.
persistent batch inference: further improves model execution efficiency.
Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github.com/InternLM/lmdeploy)
persistent batch inference: further improves model execution efficiency.<br>
Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github.com/InternLM/lmdeploy)<br>
Qwen1.5 is now supported; see sugon_readme.md for details.
## Upstream features not yet supported
- **Quantized inference**: only fp16 inference is currently supported; the AWQ int4 weight-quantization and KV-cache int8 inference schemes are not yet supported.
@@ -28,6 +29,10 @@ Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github
| QWen-7B | Yes | Yes |
| QWen-14B | Yes | Yes |
| QWen-72B | Yes | Yes |
| QWen1.5-7B | Yes | Yes |
| QWen1.5-14B | Yes | Yes |
| QWen1.5-72B | Yes | Yes |
| QWen1.5-110B | Yes | Yes |
| Baichuan-7B | Yes | Yes |
| Baichuan2-7B | Yes | Yes |
| WizardLM | Yes | Yes |
......
### Modifications (Qwen1.5)
1. requirements/runtime.txt: transformers==4.38.2
1. requirements/runtime.txt: transformers==4.38.2<br>
2. lmdeploy/turbomind/deploy/source_model/qwen.py <br>
Replace the file's contents with the code below, which adds the weight-name mapping for reading Qwen model weights<br>
```python
......
```
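The transformers pin from step 1 can be sanity-checked at runtime before loading a Qwen1.5 model. This is a minimal sketch using only the Python standard library; `check_pin` is a hypothetical helper for illustration, not part of this repository.

```python
# Verify that an installed package matches the version pinned in
# requirements/runtime.txt (transformers==4.38.2 for this fork's Qwen1.5 support).
from importlib.metadata import PackageNotFoundError, version


def check_pin(package: str, required: str) -> bool:
    """Return True only if `package` is installed at exactly version `required`."""
    try:
        return version(package) == required
    except PackageNotFoundError:
        # Package is not installed at all, so the pin cannot be satisfied.
        return False


if __name__ == "__main__":
    ok = check_pin("transformers", "4.38.2")
    print(f"transformers pinned correctly: {ok}")
```

Running this in the target environment prints whether the pin is satisfied; an exact `==` comparison is used because a newer transformers release may break the modified weight-loading code.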