ModelZoo / allamo_pytorch
Commit 207c6325, authored Nov 27, 2024 by chenzk
v1.0.2
Parent: 6dbb642d
Showing 5 changed files with 19 additions and 2 deletions (+19 -2)
Changed files:
README.md (+5 -1)
docker/requirements.txt (+11 -0)
docker_start.sh (+1 -1)
requirements.txt (+0 -0)
scripts/data/train_index.txt (+2 -0)
README.md
@@ -54,6 +54,7 @@ docker build --no-cache -t llama:latest .
 docker run --shm-size=64G --name llama -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../allamo:/home/allamo -it llama bash
 # If building the environment through the Dockerfile takes too long, comment out the pip install step in it and install the Python libraries after starting the container: pip install -r requirements.txt.
 cd /home/allamo
+pip install -r requirements.txt
 pip install -e . # install the allamo library
 ```
 ### Anaconda (Method 3)
@@ -77,6 +78,7 @@ xformers:0.0.25
 2. Install the remaining, non-special libraries according to requirements.txt
 ```
 cd /home/allamo
+pip install -r requirements.txt
 pip install -e . # install the allamo library
 ```
@@ -96,7 +98,7 @@ python prepare.py
 ```
 # Dataset preparation, method 2
 cd /home/allamo/scripts
-prepare_datasets.sh
+sh prepare_datasets.sh
 ```
 Readers comfortable with coding can also pick another open-source model from Hugging Face and, following the demo below, write their own tokenizer script to build the pretraining data. This project supports data produced by other tokenizers; lightweight tokenizers such as `meta-llama/Llama-3.2-3B` and `Qwen/Qwen2.5-1.5B` are good choices:
@@ -129,6 +131,8 @@ wandb disabled
 wandb offline
 cd /home/allamo
+mkdir /home/data/out-allamo-1B
 python train.py --config="./train_configs/train_1B.json" # or: sh train.sh
 # Other features are still being optimized
 ```
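Review note on the added `mkdir /home/data/out-allamo-1B` line: a plain `mkdir` fails when the parent directory `/home/data` does not exist, and it also fails when the target already exists, so re-running the README steps would stop at this line. In shell, `mkdir -p` avoids both cases. The sketch below shows the same idempotent behaviour in Python; `ensure_out_dir` is a hypothetical helper, not part of allamo, and whether `train.py` actually writes to this directory is an assumption based on the directory name.

```python
import tempfile
from pathlib import Path


def ensure_out_dir(path):
    """Create the directory and any missing parents; do nothing if it exists.

    Hypothetical helper for illustration only, not part of allamo.
    """
    out_dir = Path(path)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


# Demonstrated under a temporary root so the sketch does not touch /home/data.
with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "data" / "out-allamo-1B"
    ensure_out_dir(target)  # creates "data" and "out-allamo-1B"
    ensure_out_dir(target)  # second call is a no-op instead of an error
    print(target.is_dir())
```
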
docker/requirements.txt
+docker-pycreds==0.4.0
+gitdb==4.0.11
+gitpython==3.1.43
+joblib==1.4.2
+sentry-sdk==2.18.0
+setproctitle==1.3.3
+smmap==5.0.1
+tiktoken==0.7.0
+accelerate
+transformers
+wandb==0.18.7
\ No newline at end of file
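Review note: nine of the eleven entries above are pinned with `==`, while `accelerate` and `transformers` carry no version specifier, so pip will pick whichever release it resolves at install time. The following minimal sketch separates pinned from unpinned entries; `split_pinned` is a hypothetical helper written for this note, and the requirement text is copied from the diff above.

```python
import re

# Contents of docker/requirements.txt as added in this commit.
REQUIREMENTS = """\
docker-pycreds==0.4.0
gitdb==4.0.11
gitpython==3.1.43
joblib==1.4.2
sentry-sdk==2.18.0
setproctitle==1.3.3
smmap==5.0.1
tiktoken==0.7.0
accelerate
transformers
wandb==0.18.7"""


def split_pinned(text):
    """Return (pinned, unpinned): a name->version dict and a list of bare entries.

    Hypothetical helper; it only recognises the simple `name==version` form.
    """
    pinned, unpinned = {}, []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==(\S+)", line)
        if match:
            pinned[match.group(1)] = match.group(2)
        else:
            unpinned.append(line)
    return pinned, unpinned


pinned, unpinned = split_pinned(REQUIREMENTS)
print("pinned:", len(pinned))
print("unpinned:", unpinned)
```

If reproducible images matter for this project, the two unpinned entries are the ones to pin once a known-good combination has been tested.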
docker_start.sh
-docker run -it --shm-size=32G -v $PWD/allamo:/home/allamo -v /parastor/DL_DATA/HOT:/home/HOT -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name llama f6b99c8a0f01 bash
+docker run -it --shm-size=64G -v $PWD/allamo:/home/allamo -v /public/DL_DATA/AI:/home/AI -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name llama 83714c19d308 bash
 # python -m torch.utils.collect_env
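Review note: this change touches three site-specific values in a single long line (the shared-memory size, the data mount, and the image ID). As a sketch of how those values could be kept as parameters instead of being edited in place, the snippet below assembles the same command from arguments. `docker_run_args` and its parameter names are hypothetical; the flags and values are taken from the new line in the diff above, and the project directory is passed explicitly because a literal `$PWD` would not be expanded outside a shell.

```python
import shlex


def docker_run_args(image, shm_size, host_data, container_data, project_dir):
    """Build the `docker run` argument list used by docker_start.sh.

    Hypothetical helper; it only assembles the arguments and runs nothing.
    """
    return [
        "docker", "run", "-it",
        f"--shm-size={shm_size}",
        "-v", f"{project_dir}/allamo:/home/allamo",
        "-v", f"{host_data}:{container_data}",
        "-v", "/opt/hyhal:/opt/hyhal:ro",
        "--privileged=true",
        "--device=/dev/kfd",
        "--device=/dev/dri/",
        "--group-add", "video",
        "--name", "llama",
        image,
        "bash",
    ]


# Values from the new line of docker_start.sh; "/work" stands in for $PWD.
args = docker_run_args(
    image="83714c19d308",
    shm_size="64G",
    host_data="/public/DL_DATA/AI",
    container_data="/home/AI",
    project_dir="/work",
)
print(shlex.join(args))
```
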
requirmens.txt → requirements.txt
File moved
scripts/data/train_index.txt
0 → 100644
+File
+input.txt
\ No newline at end of file