# <div align="center"><strong>FastMoe</strong></div>
## Introduction
An easy-to-use, efficient system for Mixture of Experts (MoE) models in PyTorch.

## Installation
Supported component combinations

   | PyTorch version | fastpt version | FastMoe version | DTK version | Python version  | Recommended build method |
   | --------------- | -------------- | --------------- | ----------- | --------------- | ------------------------ |
   | 2.5.1           | 2.1.0          | 1.1.0           | >= 25.04    | 3.8, 3.10, 3.11 | fastpt, no transcoding   |
   | 2.4.1           | 2.0.1          | 1.1.0           | >= 25.04    | 3.8, 3.10, 3.11 | fastpt, no transcoding   |
   | Other           | Other          | Other           | Other       | 3.8, 3.10, 3.11 | HIP transcoding          |

+ When the PyTorch version is 2.4.1 or later and the DTK version is 25.04 or later, building with fastpt without transcoding is recommended.
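
The rule above can be sketched as a small shell check. This is only an illustration: `version_ge` and `recommend_build` are hypothetical helper names, the thresholds are read as "at or above" because that is what the table rows suggest, and the comparison assumes GNU `sort -V` is available.

```shell
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes GNU coreutils `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the build method suggested by the table.
# $1: PyTorch version, $2: DTK version
recommend_build() {
    if version_ge "$1" 2.4.1 && version_ge "$2" 25.04; then
        echo "fastpt, no transcoding"
    else
        echo "HIP transcoding"
    fi
}

recommend_build 2.4.1 25.04
recommend_build 2.1.0 24.04
```

Pass the versions you actually have installed; the two calls at the end only show one case on each side of the threshold.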

### 1. Install with pip
fastmoe whl download directory: [光合开发者社区](https://download.sourcefind.cn:65024/4/main/fastmoe). Download the fastmoe whl package that matches your PyTorch and Python versions.
```shell
pip install torch*              # the downloaded torch whl package
pip install fastpt* --no-deps   # the downloaded fastpt whl package
source  /usr/local/bin/fastpt -E
pip install fastmoe*            # the downloaded fastmoe whl package
```
### 2. Build and install from source

#### Preparing the build environment
A build based on fastpt without transcoding is provided:

1. Using the 光源 PyTorch base image: download the image from the [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch), choosing the image version that matches your PyTorch, Python, DTK, and operating system.

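2. Using an existing Python environment: install PyTorch and fastpt. whl download directory: [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch). Download the PyTorch whl package that matches your Python and DTK versions. The install commands are: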
```shell
pip install torch*              # the downloaded torch whl package
pip install fastpt* --no-deps   # the downloaded fastpt whl package; install torch first, then fastpt
pip install setuptools==59.5.0 wheel
pip install dm-tree
pip install pytest
pip install wheel
```

#### Building and installing from source
- Download the code
```shell
git clone http://developer.sourcefind.cn/codes/OpenDAS/fastmoe.git # switch branches as needed for your build
```
- Two ways to build from source are provided (run them from the fastmoe directory):
```
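# 1. Set the environment variables for building without transcoding
source /usr/local/bin/fastpt -C

# 2. Option A: build a whl package and install it
python3 setup.py -v bdist_wheel
pip install dist/fastmoe*

# 3. Option B: build and install directly from source
python3 setup.py install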
```
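#### Notes
+ When building and using fastmoe, the versions of fastpt, torch, and DTK that it depends on must match exactly, for example fastpt2.0.1-dtk2504, torch-2.4.1-dtk2504, and dtk2504, or fastpt2.0.1-dtk25041, torch-2.4.1-dtk25041, and dtk25041
+ If pip install downloads are slow, add the Tsinghua PyPI mirror: -i https://pypi.tuna.tsinghua.edu.cn/simple/
+ ROCM_PATH is the DTK path; the default is /opt/dtk
+ Building under PyTorch 2.5.1 requires C++17 support: open setup.py and change -std=c++14 to -std=c++17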
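
The last two notes can be applied from the command line instead of by hand. The following is a minimal sketch, not part of the project: `use_cpp17` is a hypothetical helper name, and the in-place edit assumes GNU `sed` (the `-i` flag behaves differently on BSD/macOS).

```shell
# Fall back to the default DTK location when ROCM_PATH is not already set.
export ROCM_PATH="${ROCM_PATH:-/opt/dtk}"

# Hypothetical helper: switch the C++ standard flag from C++14 to C++17.
# $1: file to edit (defaults to setup.py in the current directory)
use_cpp17() {
    file=${1:-setup.py}
    # In a basic regular expression "+" is a literal character, so no escaping is needed.
    sed -i 's/-std=c++14/-std=c++17/g' "$file"
}
```

Run `use_cpp17` from the fastmoe directory before building under PyTorch 2.5.1, then check the change with `grep -n 'std=c++' setup.py`.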

## Verification
- `python -c "import fmoe; print(fmoe.__version__)"` prints the installed version number, which tracks the official release, for example 1.1.0.
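
Before checking the version, it can help to confirm that the modules import at all. The sketch below is an illustration only: `check_module` is a hypothetical helper, and the `PYTHON` variable is an assumed way to pick the interpreter (it defaults to `python3`).

```shell
# Hypothetical helper: report whether a Python module can be imported.
# $1: module name; set PYTHON to use a different interpreter.
check_module() {
    if "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: missing"
    fi
}

# torch must import before fmoe can; report both.
for module in torch fmoe; do
    check_module "$module"
done
```

A `missing` result for `fmoe` while `torch` is `ok` usually points at the fastmoe install step itself; re-run the failing import directly to see the full error.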

## Known Issue
-

## References
- [README_ORIGIN](README_ORIGIN.md)
- [README_zh-CN](README_zh-CN.md)
- [https://github.com/laekov/fastmoe](https://github.com/laekov/fastmoe)