# Qwen3
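With only one third of the parameters of DeepSeek-R1, Qwen3 cuts cost substantially while outperforming R1, OpenAI-o1, and other leading models worldwide across the board, and it integrates fast thinking and slow thinking into a single model.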

## Paper
`None`

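## Model Architecture
Qwen3 uses the common decoder-only architecture and introduces MoE to improve performance. It is the first "hybrid reasoning model", integrating "fast thinking" and "slow thinking" into a single model.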
<div align=center>
    <img src="./doc/qwen.png"/>
</div>

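## Algorithm
The input is embedded and passed through attention, FFN, and other layers to extract features. Softmax then converts the unnormalized score vector (logits) produced by the last decoder layer into a probability distribution, in which each element is the probability of generating the corresponding vocabulary token. The model can therefore produce a distribution and pick the most likely token as its prediction.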
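The final step described above (logits to probabilities to the most likely token) can be sketched in plain Python. This is a minimal illustration, not the model's actual decoding code: the four-word vocabulary and the logit values are made up, and real inference usually samples from the distribution rather than always taking the maximum.

```python
import math

def softmax(logits):
    """Convert unnormalized scores (logits) into a probability distribution."""
    # Subtracting the maximum keeps exp() from overflowing; it does not
    # change the resulting probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_next_token(logits, vocab):
    """Pick the token with the highest probability (greedy decoding)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best], probs[best]

# Hypothetical vocabulary and last-layer logits, for illustration only.
vocab = ["the", "cat", "sat", "mat"]
logits = [1.0, 3.0, 0.5, 2.0]

token, prob = greedy_next_token(logits, vocab)
print(token, round(prob, 3))
```

The token with the largest logit always receives the largest probability, so greedy selection can also be done directly on the logits; softmax matters when the probabilities themselves are needed, for example for sampling.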

## Environment Setup
```
mv Qwen3_pytorch Qwen3 # strip the framework suffix from the directory name
```

### Docker (Method 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:vllm0.8.4-ubuntu22.04-dtk25.04-rc7-das1.5-py3.10-20250429-dev-qwen3-only
# docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04-py3.10-fixpy
# Replace <your IMAGE ID> with the ID of the image pulled above; for this image it is 6e12a1c4ae4d
docker run -it --shm-size=64G -v $PWD/Qwen3:/home/Qwen3 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name qwen3 <your IMAGE ID> bash
cd /home/Qwen3
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
```
### Dockerfile (Method 2)
```
cd /home/Qwen3/docker
docker build --no-cache -t qwen3:latest .
docker run --shm-size=64G --name qwen3 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../Qwen3:/home/Qwen3 -it qwen3 bash
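# If building the environment from the Dockerfile takes too long, comment out the pip install step in it and install the Python libraries after the container starts: pip install -r requirements.txt.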
```
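### Anaconda (Method 3)
1. The DCU-specific deep learning libraries required by this project can be downloaded from the 光合 (Guanghe) developer community and installed: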
- https://developer.sourcefind.cn/tool/
```
DTK driver: dtk2504
python: python3.10
torch: 2.4.1
torchvision: 0.19.1
triton: 3.0.0
vllm: 0.8.4
flash-attn: 2.6.1
deepspeed: 0.14.2
apex: 1.4.0
transformers: 4.51.0
```

`Tips: the versions of the DTK driver, Python, torch, and the other DCU-related tools listed above must match each other exactly.`
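Because the versions must line up exactly, it can help to check the installed packages before running anything. The sketch below uses the standard-library `importlib.metadata`. Two points are assumptions rather than facts from this project: the distribution names in `EXPECTED` (taken from the list above) and the idea that a DCU build may append a local suffix such as `+something` to the version string, which `matches` tolerates.

```python
from importlib import metadata

# Versions from the list above. The distribution names are assumptions;
# adjust them if the DCU wheels are published under different names.
EXPECTED = {
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "triton": "3.0.0",
    "vllm": "0.8.4",
    "flash-attn": "2.6.1",
    "deepspeed": "0.14.2",
    "apex": "1.4.0",
    "transformers": "4.51.0",
}

def matches(installed, expected):
    """True if installed equals expected, ignoring a local '+suffix' (assumed)."""
    return installed == expected or installed.startswith(expected + "+")

def check_versions(expected=EXPECTED):
    """Return a list of human-readable problems; empty means everything matches."""
    problems = []
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {want})")
            continue
        if not matches(have, want):
            problems.append(f"{name}: found {have}, expected {want}")
    return problems

if __name__ == "__main__":
    for line in check_versions() or ["all listed packages match"]:
        print(line)
```

The DTK driver and the Python interpreter are not pip packages, so they are not covered by this check.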

2. Install the remaining, non-DCU-specific libraries from requirements.txt:
```
cd /home/Qwen3
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
```

## Dataset
`None`

## Training


## Inference
Directory layout of the pretrained weights:
```
/home/Qwen3/
    └── Qwen/Qwen3-8B
``` 
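A quick way to confirm that the weights sit where the layout above expects is a small path check. This is only a sketch: `locate_weights` is a hypothetical helper, not part of this repository, and the `config.json` test assumes the checkpoint is in the usual Hugging Face format.

```python
from pathlib import Path

def locate_weights(root="/home/Qwen3", model_id="Qwen/Qwen3-8B"):
    """Return the weight directory under root, or raise if the layout is wrong."""
    model_dir = Path(root) / model_id
    if not model_dir.is_dir():
        raise FileNotFoundError(f"expected weights at {model_dir}")
    # Assumption: the checkpoint ships a config.json, as Hugging Face-format
    # checkpoints normally do.
    if not (model_dir / "config.json").is_file():
        raise FileNotFoundError(f"{model_dir} has no config.json")
    return model_dir

if __name__ == "__main__":
    try:
        print("weights found at", locate_weights())
    except FileNotFoundError as err:
        print("weights missing:", err)
```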

### Single Node, Multiple Cards
```
cd /home/Qwen3

# Method 1: PyTorch inference
# This project uses Qwen3-8B as the example; other Qwen3 models work the same way.
python infer_transformers.py

# Method 2: vLLM inference
python infer_vllm.py # vllm=0.8.4
```

For more information, see [`README_orgin`](./README_orgin.md) from the original project.

## Results
Example vLLM inference output:

`Input:`
```
prompt: "Give me a short introduction to large language models."
```

`Output:`
```
Generated text: "<think>\nOkay, the user wants a short introduction to large language models. Let me start by defining what they are. They're AI systems trained on massive text data, right? I should mention their ability to understand and generate human-like text. Maybe include examples like GPT or BERT.\n\nWait, the user might not know the difference between different models. Should I explain the training process? Like using unsupervised learning on vast datasets. Also, highlight their applications: answering questions, writing stories, coding. But keep it concise since it's supposed to be short.\n\nOh, and maybe touch on their significance in NLP. Emphasize that they can handle multiple languages and tasks. Need to make sure it's clear without too much jargon. Let me check if I'm missing any key points. Oh, scalability and adaptability could be important. Alright, structure it with a definition, how they work, applications, and impact. Keep each part brief.\n</think>\n\nLarge language models (LLMs) are advanced artificial intelligence systems trained on vast amounts of text data to understand and generate human-like language. They use deep learning techniques to process and produce coherent responses across multiple languages and tasks, such as answering questions, writing stories, coding, and more. By analyzing patterns in text, LLMs can adapt to diverse contexts, making them powerful tools for natural language processing (NLP) and a wide range of applications, from customer service to creative writing. Their ability to scale and learn from extensive datasets has revolutionized how machines interact with and understand human communication."
```
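The generated text above contains the model's reasoning between `<think>` and `</think>`, followed by the final answer. If only the answer is needed, the two parts can be separated with plain string handling. A minimal sketch, assuming at most one such block per generation; `split_thinking` is a hypothetical helper, not a function from vLLM or transformers.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Split generated text into (thinking, answer).

    If the tags are missing or out of order, the whole text is treated
    as the answer and the thinking part is empty.
    """
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1 or end < start:
        return "", text.strip()
    thinking = text[start + len(open_tag):end].strip()
    answer = text[end + len(close_tag):].strip()
    return thinking, answer

# Shortened stand-in for the generated text shown above.
sample = (
    "<think>\nOkay, the user wants a short introduction to large language models.\n"
    "</think>\n\nLarge language models (LLMs) are advanced artificial intelligence systems."
)

thinking, answer = split_thinking(sample)
print("thinking:", thinking)
print("answer:", answer)
```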

### Accuracy
Accuracy on DCU is consistent with GPU. Inference framework: PyTorch.

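## Application Scenarios
### Algorithm Category
`Conversational question answering`
### Key Application Industries
`Manufacturing, broadcast media, finance, energy, healthcare, home, education`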
## Pretrained Weights
ModelScope (魔搭社区) download address: [Qwen/Qwen3-8B](https://www.modelscope.cn/Qwen/Qwen3-8B.git)
## Source Repository and Issue Feedback
- http://developer.sourcefind.cn/codes/modelzoo/Qwen3_pytorch.git
## References
- https://github.com/QwenLM/Qwen3.git