Commit 76ae8627 authored by pppppM, committed by GitHub

update zh readme (#74)

parent 74a4f3c9
@@ -38,7 +38,7 @@ LMDeploy, by [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
 - **Persistent batch inference**: further optimizes model execution efficiency.
-![PersistentBatchInference](https://github.com/open-mmlab/lmdeploy/assets/25839884/8f8b57b8-42af-4b71-ad74-e75f39b10694)
+![PersistentBatchInference](https://github.com/InternLM/lmdeploy/assets/67539920/e3876167-0671-44fc-ac52-5a0f9382493e)
 ## Performance
@@ -52,7 +52,7 @@ LMDeploy, by [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
 TurboMind's throughput exceeds 2000 token/s, roughly 5% - 15% higher overall than DeepSpeed, and it outperforms huggingface transformers by 2.3x
-![benchmark](https://github.com/InternLM/lmdeploy/assets/4560679/1aa64d01-621c-4b53-8e48-e66bc4636b3b)
+![benchmark](https://github.com/InternLM/lmdeploy/assets/67539920/bb9fdf35-8dc5-41f5-ad5e-33df786665e3)
 ## Quick Start
@@ -117,7 +117,7 @@ python3 lmdeploy.serve.client {server_ip_addresss}:33337 internlm
 python3 lmdeploy.app {server_ip_addresss}:33337 internlm
 ```
-![](https://github.com/open-mmlab/lmdeploy/assets/41138331/f4352172-d8b1-49aa-b658-50ce72b896a5)
+![](https://github.com/InternLM/lmdeploy/assets/67539920/08d1e6f2-3767-44d5-8654-c85767cec2ab)
 For deploying other models, such as LLaMA and vicuna, see [here](docs/zh_cn/serving.md)