# <div align="center"><strong>Unsloth</strong></div>
## Introduction
The unsloth framework uses Triton to optimize model training speed and GPU memory usage. Fine-tuning Mistral, Gemma, or Llama with Unsloth can be 2-5x faster while using up to 70% less memory.

## Installation
Supported components:
+ Python 3.10

### 1. Installing from source

#### Build environment setup
Two ways to prepare the environment:

1. Use the SourceFind PyTorch base image. Image download: [https://sourcefind.cn/#/image/dcu/pytorch](https://sourcefind.cn/#/image/dcu/pytorch); choose the image that matches your PyTorch, Python, DTK, and OS versions.

2. Use an existing Python environment and install PyTorch. PyTorch whl packages: [torch-2.1.0](https://cancon.hpccube.com:65024/4/main/pytorch/DAS1.1); download the whl that matches your Python and DTK versions, then install it:
```shell
pip install torch-*.whl  # the downloaded torch whl package
```

#### Build and install from source
- Download the source
```shell
git clone http://developer.hpccube.com/codes/OpenDAS/unsloth.git  # switch to the branch you need
```
- Build and install
```shell
cd unsloth
pip install .
```

```
# To modify unsloth yourself: clone the latest upstream code from GitHub,
# apply the changes below, then run pip install:
vim unsloth/kernels/cross_entropy_loss.py
MAX_FUSED_SIZE = 65536 -> MAX_FUSED_SIZE = 16384
num_warps = 32 -> num_warps = 8 # under _chunked_cross_entropy_forward[(n_rows, n_chunks,)] in class Fast_CrossEntropyLoss

vim unsloth/kernels/utils.py
if   BLOCK_SIZE >= 32768: num_warps = 32 -> if   BLOCK_SIZE >= 32768: num_warps = 8
elif BLOCK_SIZE >=  8192: num_warps = 16 -> elif BLOCK_SIZE >=  8192: num_warps = 8
# inside the function calculate_settings

vim unsloth/models/_utils.py
model_architectures = ["llama", "mistral", "gemma", "gemma2", "qwen2",] -> model_architectures = ["llama", "mistral", "qwen2",] 

vim unsloth/models/llama.py
Q = Q.transpose(1, 2) -> Q = Q.transpose(1, 2).half()
K = K.transpose(1, 2) -> K = K.transpose(1, 2).half()
V = V.transpose(1, 2) -> V = V.transpose(1, 2).half()
# in LlamaAttention_fast_forward, under elif HAS_FLASH_ATTENTION and attention_mask is None
```
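The intent of the two kernel patches above can be illustrated with a small plain-Python sketch (no Triton needed). The constants and warp thresholds mirror the edits above; `next_power_of_2` stands in for `triton.next_power_of_2`, and `n_chunks_for_vocab` is an illustrative helper, not unsloth API:

```python
# Plain-Python sketch of what the two kernel patches change.
MAX_FUSED_SIZE = 16384  # cross_entropy_loss.py: patched down from 65536


def next_power_of_2(n: int) -> int:
    """Smallest power of two >= n (stand-in for triton.next_power_of_2)."""
    return 1 << (n - 1).bit_length()


def calculate_settings(n: int):
    """utils.py after the patch: num_warps is capped at 8 for large blocks."""
    BLOCK_SIZE = next_power_of_2(n)
    num_warps = 4
    if BLOCK_SIZE >= 32768:
        num_warps = 8   # was 32 before the patch
    elif BLOCK_SIZE >= 8192:
        num_warps = 8   # was 16 before the patch
    elif BLOCK_SIZE >= 2048:
        num_warps = 8
    return BLOCK_SIZE, num_warps


def n_chunks_for_vocab(vocab_size: int) -> int:
    """Illustrative: the chunked cross-entropy forward splits the vocab
    dimension into chunks of at most MAX_FUSED_SIZE columns."""
    return (vocab_size + MAX_FUSED_SIZE - 1) // MAX_FUSED_SIZE


print(calculate_settings(50000))    # -> (65536, 8): 8 warps instead of 32
print(n_chunks_for_vocab(128256))   # -> 8 chunks of <=16384 instead of 2 of <=65536
```

Together, the smaller fused size and the 8-warp cap trade a few extra kernel launches for block shapes that fit the DCU hardware.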
#### Notes
+ If pip install downloads are slow, add the Tsinghua PyPI mirror: -i https://pypi.tuna.tsinghua.edu.cn/simple/

## Verification
- Run `python -c "import unsloth"`; expected output: `Unsloth: Will patch your computer to enable 2x faster free finetuning.`

## Known Issues
-

## References
- [README_origin](README_origin.md)
- [Unsloth](https://github.com/unslothai/unsloth.git)