Commit c5d362f1 authored by raojy's avatar raojy

update readme

parent 408de0f5
......@@ -40,10 +40,41 @@ Qwen2.5-VL trains a native dynamic-resolution ViT from scratch, including CLIP
Recommended image: harbor.sourcefind.cn:5443/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-1226-das1.7-py3.10-20251226
- Adjust the `-v` mount paths, `{docker_name}`, and `{docker_image_name}` to match your actual model setup
```bash
docker run -it --shm-size 60g --network=host --name minimax_m2 --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal/:/opt/hyhal/:ro -v /path/your_code_path/:/path/your_code_path/ harbor.sourcefind.cn:5443/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-1226-das1.7-py3.10-20251226
docker run -it \
--shm-size 60g \
--network=host \
--name {docker_name} \
--privileged \
--device=/dev/kfd \
--device=/dev/dri \
--device=/dev/mkfd \
--group-add video \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-u root \
-v /opt/hyhal/:/opt/hyhal/:ro \
-v /path/your_code_data/:/path/your_code_data/ \
{docker_image_name} bash
# Example (the content shown on ModelZoo: fill in {docker_image_name} and {docker_name} above according to the actual model):
docker run -it \
--shm-size 60g \
--network=host \
--name qwen3 \
--privileged \
--device=/dev/kfd \
--device=/dev/dri \
--device=/dev/mkfd \
--group-add video \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-u root \
-v /opt/hyhal/:/opt/hyhal/:ro \
-v /path/your_code_data/:/path/your_code_data/ \
image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-py3.10 bash
```
More images are available for download from [光源](https://sourcefind.cn/#/service-list).
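The `docker run` template above can be wrapped in a small script so the placeholders are filled in one place and the command can be reviewed before running. A minimal sketch; the variable names (`DOCKER_NAME`, `DOCKER_IMAGE`, `CODE_DIR`) are hypothetical, and the script only prints the command rather than executing it:

```shell
#!/bin/sh
# Fill these in for your model (hypothetical placeholder values shown).
DOCKER_NAME="qwen3"
DOCKER_IMAGE="image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-py3.10"
CODE_DIR="/path/your_code_data"

# Assemble the full docker run command from the variables above.
CMD="docker run -it --shm-size 60g --network=host --name ${DOCKER_NAME} \
--privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd \
--group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
-u root -v /opt/hyhal/:/opt/hyhal/:ro -v ${CODE_DIR}/:${CODE_DIR}/ \
${DOCKER_IMAGE} bash"

# Print for review; run with `eval "$CMD"` once it looks right.
echo "$CMD"
```

Printing first makes it easy to confirm the DCU device mounts (`/dev/kfd`, `/dev/dri`) survived any edits before launching the container.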
......@@ -200,6 +231,10 @@ torchrun ./LLaMA-Factory/src/train.py \
## Inference
1. At least one inference framework is required: `transformers`, `vllm`, or any other inference framework;
### transformers
#### Single Node, Single Card
```
......@@ -233,10 +268,6 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 python inference.py
#### Single-Node Inference
```
## launch with vllm serve
export ALLREDUCE_STREAM_WITH_COMPUTE=1
export VLLM_MLA_DISABLE=0
export VLLM_USE_FLASH_MLA=1
# launch command
vllm serve Qwen/Qwen2.5-VL-3B-Instruct \
......@@ -251,7 +282,7 @@ vllm serve Qwen/Qwen2.5-VL-3B-Instruct \
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "minimax",
"model": "qwen-vl",
"messages": [
{
"role": "user",
......@@ -272,23 +303,18 @@ curl http://localhost:8000/v1/chat/completions \
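Shell quoting inside an inline `-d '{…}'` payload is easy to get wrong. One hedged alternative is to build the request body in a variable and validate it as JSON before POSTing; the model name `qwen-vl` must match the name the server was started with:

```shell
#!/bin/sh
# Build the chat-completions request body separately so it can be checked first.
PAYLOAD='{
  "model": "qwen-vl",
  "messages": [
    {"role": "user", "content": "Describe this image."}
  ]
}'

# Validate locally; only POST once the JSON parses.
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"
# curl http://localhost:8000/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$PAYLOAD"
```

The `curl` line is left commented out so the snippet can be run without a live server.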
DCU results match GPU accuracy; inference framework: vllm.
## Application Scenarios
### Algorithm Category
`Dialogue Q&A`
### Key Application Industries
`Research, Education, Government, Finance`
## Pretrained Weights
[ModelScope](https://modelscope.cn/)
- [Qwen2.5-VL-3B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-VL-3B-Instruct)
- [Qwen2.5-VL-7B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-VL-7B-Instruct)
- [Qwen2.5-VL-72B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-VL-72B-Instruct)
| **Model** | **Parameters** | **Recommended DCU** | **Minimum Cards** | **Download (Hugging Face)** |
| --------------------------- | -------------- | -------------------- | ----------------- | ------------------------------------------------------------ |
| **Qwen2.5-VL-3B-Instruct**  | 3B             | K100AI, BW1000, etc. | 1                 | [Hugging Face](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) |
| **Qwen2.5-VL-7B-Instruct**  | 7B             | K100AI, BW1000, etc. | 1                 | [Hugging Face](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) |
| **Qwen2.5-VL-72B-Instruct** | 72B            | K100AI, BW1000, etc. | 8                 | [Hugging Face](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct) |
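The Hugging Face links above can be fetched from the command line with `huggingface-cli download` (part of the `huggingface_hub` package). A hedged sketch; `MODEL_ID` and `LOCAL_DIR` are placeholders, and the 72B checkpoint needs a volume with plenty of space:

```shell
#!/bin/sh
# Placeholders: pick the model and a target directory with enough disk space.
MODEL_ID="Qwen/Qwen2.5-VL-3B-Instruct"
LOCAL_DIR="/path/your_code_data/models/${MODEL_ID##*/}"

# Assemble the download command; print it for review instead of running it here.
CMD="huggingface-cli download ${MODEL_ID} --local-dir ${LOCAL_DIR}"
echo "$CMD"
```

Remove the `echo` wrapper to actually download; the same pattern works for the 7B and 72B variants by changing `MODEL_ID`.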
## Source Repository & Issue Reporting
- https://developer.sourcefind.cn/codes/modelzoo/Qwen2.5-vl_pytorch
## References
......