Commit 5d85c437 authored by zhouxiang

Improve README

parent cfc0817e
@@ -35,7 +35,7 @@ mkdir build && cd build
sh ../generate.sh
make -j 32
make install
-cd .. && python3 setup.py install
+cd .. && pip uninstall lmdeploy && python3 setup.py install
```
### Model download
@@ -59,6 +59,7 @@ lmdeploy chat turbomind --model_path ./workspace_yi-34b --tp 4
```
#### Interacting through a web page
```shell
lmdeploy serve gradio --model_path_or_server ./workspace_yi-34b --server_name {server_ip} --server_port {port} --batch_size 32 --tp 4 --restful_api False
```
@@ -70,7 +71,7 @@ lmdeploy serve gradio --model_path_or_server ./workspace_yi-34b --server_name {s
```shell
# --instance_num: number of turbomind inference instances; effectively the maximum supported concurrency
# --tp: number of GPUs used for tensor parallelism
-lmdeploy serve api_server ./workspace_yi-34b --server_name {server_ip} --server_port ${server_port} --instance_num 32 --tp 1
+lmdeploy serve api_server ./workspace_yi-34b --server_name ${server_ip} --server_port ${server_port} --instance_num 32 --tp 4
```
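The switch from `{server_ip}` to `${server_ip}` in the command above matters in a shell: only the `$`-prefixed form is expanded as a variable, while bare braces reach the program unchanged. A minimal sketch of the difference, where the address and port values are placeholders chosen only for illustration:

```shell
# Placeholder values, chosen only for illustration.
server_ip=0.0.0.0
server_port=23333

# Without the dollar sign, the braces are passed through literally.
literal="--server_name {server_ip}"
# With the dollar sign, the shell substitutes the variable's value.
expanded="--server_name ${server_ip} --server_port ${server_port}"

echo "$literal"
echo "$expanded"
```

With the old form, the server would have received the literal string `{server_ip}` as its host name unless the reader edited the command by hand.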
Open `http://{server_ip}:{server_port}` in a browser to access Swagger and browse the details of the RESTful API.
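Before opening the browser, it can help to confirm that the server answers at that address. The following is a rough sketch, not part of the README: the address and port are placeholder values, and it only requests the root page and reports the HTTP status code.

```shell
# Placeholder values; use the address and port the api_server was started with.
server_ip=127.0.0.1
server_port=23333

base_url="http://${server_ip}:${server_port}"
echo "Swagger page: ${base_url}"

# Request only the root page and print the HTTP status code.
# curl reports 000 when no connection could be made.
if command -v curl >/dev/null 2>&1; then
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "${base_url}" || true)
    echo "HTTP status: ${status}"
fi
```

A status of `200` suggests the page is being served; `000` usually means the server is not running or the address or port is wrong.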
@@ -91,7 +92,7 @@ lmdeploy serve gradio restful_api_url --server_name ${server_ip} --server_port $
For a detailed introduction to the RESTful API, see [this document](https://developer.hpccube.com/codes/aicomponent/lmdeploy/-/blob/dtk23.04-v0.0.13/docs/zh_cn/restful_api.md).
## result
-![llama](docs/yi-34b.gif)
+![llama](docs/yi34b.gif)
### Accuracy
@@ -110,7 +111,9 @@ lmdeploy serve gradio restful_api_url --server_name ${server_ip} --server_port $
## Source repository and issue feedback
-https://developer.hpccube.com/codes/modelzoo/codellama_lmdeploy
+https://developer.hpccube.com/codes/modelzoo/yi_lmdeploy
## References
https://github.com/InternLM/LMDeploy
https://github.com/01-ai/Yi
https://github.com/InternLM/LMDeploy
\ No newline at end of file