# JIUTIAN-139MoE-Chat

## Paper

https://www.modelscope.cn/models/JiuTian-AI/JIUTIAN-139MoE-chat/file/view/master?fileName=JIUTIAN-139MOE%2520TECHNICAL%2520REPORT.pdf&status=1

## Model Architecture

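JIUTIAN-139MoE is a large language model with 13 billion parameters. It uses a decoder-only MoE architecture that contains one pair of large experts and six small experts. The model can be trained on different GPU and NPU clusters and switched between them losslessly. The MoE design is applied in the FFN layers, with dedicated activation and routing mechanisms.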

![](https://developer.sourcefind.cn/codes/modelzoo/jiutian-139moe-chat/-/raw/main/jiutian.png?inline=false)

## Algorithm Principles

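JIUTIAN-139MoE uses a Mixture-of-Experts (MoE) architecture: expert networks of different sizes process different data features, and a gating mechanism assigns each input to the most suitable experts, which improves the model's ability to handle complex problems.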

![](https://developer.sourcefind.cn/codes/modelzoo/jiutian-139moe-chat/-/raw/main/MoE.png?inline=false)
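
The gating idea described above can be illustrated with a minimal sketch. This is not the routing used by JIUTIAN-139MoE, whose activation and routing details are in the technical report; it is a generic top-k softmax gate over toy experts. The dimensions, the `top_k=2` choice, the random weights, and all function names are assumptions made only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(in_dim, hidden_dim, rng):
    """Toy expert: in_dim -> hidden_dim (ReLU) -> in_dim with random weights."""
    w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(in_dim)]

    def expert(x):
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return [sum(w * hi for w, hi in zip(row, h)) for row in w2]

    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route one token vector to its top_k experts and mix their outputs."""
    # One gate logit per expert: a dot product between x and that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalise only over the selected experts so the mixing weights sum to 1.
    weights = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen, weights


rng = random.Random(0)
DIM = 4
# Hypothetical sizes: 2 "large" and 6 "small" experts, echoing the description above.
experts = [make_expert(DIM, 16, rng) for _ in range(2)]
experts += [make_expert(DIM, 4, rng) for _ in range(6)]
gate_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in experts]

output, chosen, weights = moe_forward([1.0, -0.5, 0.25, 2.0], gate_weights, experts)
print("experts used:", chosen, "mixing weights:", [round(w, 3) for w in weights])
```

Only the selected experts run for a given token, which is why an MoE layer can hold many parameters while keeping per-token compute low.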
## Environment Setup

### Docker (Method 1)

Pull the image, then start and enter the container:

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
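# <Image ID>: replace with the ID of the Docker image pulled above
# <Host Path>: path on the host
# <Container Path>: path inside the container that the host path is mounted to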
docker run -it  --shm-size 80g --network=host --name=jiutian --privileged  --device /dev/m--device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /opt/hyhal/:/opt/hyhal/:ro -v <Host Path>:<Container Path> <Image ID> /bin/bash
```

### Dockerfile (Method 2)

```
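# <Host Path>: path on the host
# <Container Path>: path inside the container that the host path is mounted to
# <Image ID>: ID or tag of the image built by the command below, e.g. jiutian:latest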
docker build -t jiutian:latest .
docker run -it  --shm-size 80g --network=host --name=jiutian --privileged  --device /dev/m--device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /opt/hyhal/:/opt/hyhal/:ro -v <Host Path>:<Container Path> <Image ID> /bin/bash
```

### Anaconda (Method 3)

```
conda create -n jiutian python=3.10
```
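
After activating the environment, a quick sanity check can confirm that the interpreter and key packages are in place. The sketch below is an assumption-laden helper, not part of this repository: the function name `check_environment` is hypothetical, the minimum version mirrors the `python=3.10` argument above, and `torch` is checked only because the Docker image in Method 1 is a PyTorch image.

```python
import importlib.util
import sys


def check_environment(required_python=(3, 10), packages=("torch",)):
    """Report whether the interpreter version and the listed packages are available."""
    report = {"python_ok": sys.version_info[:2] >= required_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, ok in check_environment().items():
    print(f"{key}: {'ok' if ok else 'missing'}")
```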

## Model Download

```
# Clone the project
git clone http://developer.hpccube.com/codes/modelzoo/jiutian-139moe-chat.git
cd jiutian-139moe-chat
```

Download the model weights: [JIUTIAN-139MoE-Chat · Model Hub (modelscope.cn)](https://www.modelscope.cn/models/jiutian-ai/jiutian-139moe-chat/files)

## Dataset



## Training



## Inference

```
python inference.py
```

### Serving the Model with FastAPI

```
python app.py
```

To test that the service can be called from inside the container, open another terminal and run:

```
curl -X POST "http://localhost:8000/predict/" -H "Content-Type: application/json" -d '{"text": "Please introduce the Great Wall."}'
```
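
The same request can also be sent from Python. The sketch below uses only the standard library; the URL and the `text` field come from the curl command above, while the helper names `build_request` and `predict` and the 300-second timeout are assumptions for illustration.

```python
import json
import urllib.request

# URL and JSON field name are taken from the curl command above.
PREDICT_URL = "http://localhost:8000/predict/"


def build_request(text, url=PREDICT_URL):
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, url=PREDICT_URL, timeout=300):
    """Send the prompt and return the decoded JSON reply (app.py must be running)."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With `app.py` running, `predict("Please introduce the Great Wall.")` should return the parsed JSON reply as a Python dictionary.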

## Result

```
{"response":"Human:\nPlease introduce the Great Wall.\n\nAssistant:\n The Great Wall of China is a series of fortifications built along the northern borders of China to protect against invasions and raids from various nomadic groups. It is one of the most famous landmarks in China and is also one of the largest construction projects in human history.\n\nThe Great Wall stretches"}
```
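
The `response` field above echoes the prompt in a `Human:` / `Assistant:` layout, so a client usually wants only the assistant's part. A minimal sketch of that post-processing follows; it assumes the layout seen in this one sample holds in general, and `extract_reply` is a hypothetical helper, not a function from this repository.

```python
import json


def extract_reply(raw_json):
    """Return only the assistant's text from the service's JSON reply."""
    response = json.loads(raw_json)["response"]
    # Keep everything after the last "Assistant:" marker; fall back to the full text.
    _, marker, reply = response.rpartition("Assistant:")
    return reply.strip() if marker else response.strip()


# Shortened copy of the sample reply shown above.
sample = (
    '{"response":"Human:\\nPlease introduce the Great Wall.\\n\\n'
    'Assistant:\\n The Great Wall of China is a series of fortifications"}'
)
print(extract_reply(sample))
```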

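## Application Scenarios

### Algorithm Category

Conversational question answering

### Key Application Industries

Telecommunications, energy, transportation, aviation, steel, finance, healthcare, construction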

## Source Repository and Issue Feedback

http://developer.hpccube.com/codes/modelzoo/jiutian-139moe-chat.git

## References

[JIUTIAN-139MoE-Chat · Model Hub (modelscope.cn)](https://www.modelscope.cn/models/JiuTian-AI/JIUTIAN-139MoE-chat)