Commit fea70571 authored by chenzhuo

readme_20240521

parent 75cf0593
Pipeline #1014 failed with stages in 0 seconds
@@ -13,7 +13,7 @@ LMDeploy by [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
 persistent batch inference: further optimizes model execution efficiency.<br>
 Official LMDeploy GitHub repository: [https://github.com/InternLM/lmdeploy](https://github.com/InternLM/lmdeploy)<br>
-Qwen1.5 is now supported; see sugon_readme.md for details
+**Qwen1.5 is now supported; see sugon_readme.md for details**
 ## Official features not yet supported
 - **Quantized inference**: currently only fp16 inference is supported; AWQ int4 weight quantization and KV-cache int8 inference are not yet supported
......