ModelZoo / ChatGLM3-6B_pytorch / Commits / de80d0ec

Commit de80d0ec authored Jun 17, 2025 by change
fix:add training method
parent e2c9ec30

Showing 4 changed files with 78 additions and 7 deletions (+78, -7):
- README.md (+60, -2)
- finetune_demo/configs/sft.yaml (+1, -1)
- finetune_demo/lora.sh (+6, -2)
- finetune_demo/sft.sh (+11, -2)
README.md (view file @ de80d0ec)
@@ -95,8 +95,20 @@ site-packages/transformers/utils/versions.py file

## Dataset

The single-turn dialogue data uses the [ADGEN](https://aclanthology.org/D19-1321.pdf) (advertisement generation) dataset as the running example for the code. The task of this dataset is to generate a piece of advertising copy (summary) from the input (content). Download links:
- [Google Drive](https://drive.google.com/file/d/13_vf0xRTQsyneRKdD1bZIr93vBGOczrk/view?usp=sharing) or [Tsinghua Cloud](https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1)

Download the preprocessed ADGEN dataset and place the extracted AdvertiseGen directory under [finetune_demo/data](./finetune_demo/data). The dataset directory structure is as follows:

```
── AdvertiseGen
│   ├── dev.json
│   └── train.json
```

Convert the dataset into the format required by the model:

```bash
cd finetune_demo
python process.py
```
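For orientation, below is a minimal sketch of what this conversion step typically produces, assuming process.py rewrites each raw ADGEN record ({"content": ..., "summary": ...}) into the conversations format consumed by finetune_hf.py; the script in this repo may differ in paths and details. The output directory name here matches the AdvertiseGen_fix path that lora.sh and sft.sh point at.

```python
# Hypothetical sketch of the ADGEN -> conversations conversion; the actual
# process.py in this repo may differ.
import json
from pathlib import Path

def convert(src: Path, dst: Path) -> None:
    """Rewrite {"content": ..., "summary": ...} lines into the
    {"conversations": [...]} format used by the fine-tuning demo."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            sample = json.loads(line)
            converted = {
                "conversations": [
                    {"role": "user", "content": sample["content"]},
                    {"role": "assistant", "content": sample["summary"]},
                ]
            }
            fout.write(json.dumps(converted, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    for split in ("train", "dev"):
        convert(Path(f"data/AdvertiseGen/{split}.json"),
                Path(f"data/AdvertiseGen_fix/{split}.json"))
```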
### Model download
@@ -107,8 +119,54 @@ site-packages/transformers/utils/versions.py file

| ChatGLM3-6B-32K | 32k | [HuggingFace](https://huggingface.co/THUDM/chatglm3-6b-32k) \| [ModelScope](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-32k) \| [SCNet] |
## Training

### SFT fine-tuning

#### Single-turn dialogue fine-tuning

```bash
# Requires 4 GPUs, otherwise it will OOM (choose idle visible devices yourself)
export HIP_VISIBLE_DEVICES=4,5,6,7
cd ./finetune_demo
bash sft.sh
```

Note: set the model path and dataset path in sft.sh to match your setup; batch size, learning rate and other parameters are in ./finetune_demo/configs/sft.yaml.
#### Inference verification

For fine-tuning on the input/output format, `sft_inf.sh` can be used for a basic inference check.

After the fine-tuning job finishes, the `output` folder contains a number of `checkpoint-*` folders; these correspond to the training rounds. We pick the fine-tuned weights from the last round and load them for inference.

Note: at this point, copy the original `tokenizer_config.json` and `tokenization_chatglm.py` files downloaded from HF into the checkpoint folder under test, e.g. ./finetune_demo/output/checkpoint-3000/

```bash
cd ./finetune_demo
bash sft_inf.sh
```
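Beyond `sft_inf.sh`, the checkpoint can also be loaded directly with transformers for a quick spot check. The following is a minimal sketch, assuming the tokenizer files have been copied in as described above; the checkpoint path is an example, not a path from this repo.

```python
# Hypothetical spot check of a full-SFT checkpoint; paths are examples.
from transformers import AutoModel, AutoTokenizer

ckpt = "./output/checkpoint-3000"  # replace with your last checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModel.from_pretrained(ckpt, trust_remote_code=True).half().cuda().eval()

# ChatGLM3's remote code exposes a chat() helper that returns (response, history).
response, _ = model.chat(tokenizer, "类型#上衣*材质#牛仔布*颜色#白色", history=[])
print(response)
```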
### LoRA fine-tuning

#### Single-turn dialogue fine-tuning

```bash
# A single GPU is enough; specify an idle visible device yourself to avoid OOM
export HIP_VISIBLE_DEVICES=7
cd ./finetune_demo
bash lora.sh
```

Note: set the model path and dataset path to match your setup; batch size, learning rate and other parameters are in ./finetune_demo/configs/lora.yaml.
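The LoRA hyperparameters in configs/lora.yaml (rank, alpha, dropout, target modules) correspond to a standard peft configuration. A rough Python equivalent is sketched below for orientation; the values are illustrative, not the repo's defaults.

```python
# Illustrative LoRA setup; read the real values from configs/lora.yaml.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModel

base = AutoModel.from_pretrained("/path/to/chatglm3-6b", trust_remote_code=True)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # LoRA rank (illustrative)
    lora_alpha=32,                       # scaling factor (illustrative)
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM attention projection
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()       # only the adapter weights are trainable
```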
#### Inference verification

After the fine-tuning job finishes, the `output` folder contains a number of `checkpoint-*` folders; these correspond to the training rounds. We pick the fine-tuned weights from the last round and load them for inference.

Note: for checkpoints produced by LoRA fine-tuning, there is no need to copy the original GLM3 tokenizer files into the checkpoint directory.

```bash
cd ./finetune_demo
bash lora_inf.sh
```
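The reason the tokenizer files are not needed here: a LoRA checkpoint saved by peft contains only the adapter weights plus an adapter_config.json that records the base model path, so inference can load the adapter and fetch the tokenizer from the original base model directory. A minimal sketch, assuming a peft-saved checkpoint and example paths:

```python
# Hypothetical loading of a LoRA checkpoint; paths are examples.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

ckpt = "./output/checkpoint-3000"  # contains adapter weights + adapter_config.json
model = AutoPeftModelForCausalLM.from_pretrained(ckpt, trust_remote_code=True).half().cuda().eval()

# The adapter config remembers where the base ChatGLM3 weights live,
# so the tokenizer is loaded from that original directory.
base = model.peft_config["default"].base_model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
```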
## Inference
finetune_demo/configs/sft.yaml (view file @ de80d0ec)

```diff
@@ -12,7 +12,7 @@ training_args:
   # needed to be fit for the dataset
   learning_rate: 5e-5
   # settings for data loading
-  per_device_train_batch_size: 4
+  per_device_train_batch_size: 1
   dataloader_num_workers: 16
   remove_unused_columns: false
   # settings for saving checkpoints
```
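Lowering per_device_train_batch_size mainly reduces per-GPU memory use; the global batch per optimizer step is this value multiplied by the number of torchrun processes (and by gradient_accumulation_steps, if sft.yaml sets it). A quick check with illustrative numbers:

```python
# Illustrative only; read the actual values from configs/sft.yaml and sft.sh.
per_device_train_batch_size = 1
nproc_per_node = 4               # matches --nproc_per_node in sft.sh
gradient_accumulation_steps = 1  # assumed if the yaml does not set it

global_batch_size = per_device_train_batch_size * nproc_per_node * gradient_accumulation_steps
print(global_batch_size)  # 4
```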
finetune_demo/lora.sh (view file @ de80d0ec)

```diff
-export HIP_VISIBLE_DEVICES=7
-python finetune_hf.py data/AdvertiseGen_fix /path/to/chatglm3-6b configs/lora.yaml
+#!/bin/bash
+# This script is used to fine-tune a model using PyTorch with multiple GPUs.
+export HSA_FORCE_FINE_GRAIN_PCIE=1
+set -x
+python finetune_hf.py /home/client/nb/AdvertiseGen_fix /home/model/ZhipuAI/chatglm3-6b configs/lora.yaml
```
finetune_demo/sft.sh (view file @ de80d0ec)

```diff
-export HIP_VISIBLE_DEVICES=1,2,3,4
+#!/bin/bash
+# This script is used to fine-tune a model using PyTorch with multiple GPUs.
 export HSA_FORCE_FINE_GRAIN_PCIE=1
+set -x
+# Run the script with 4 GPUs
+# Ensure you have the necessary environment variables set for PyTorch and CUDA
+# Example command to run the script with 4 GPUs
+# Make sure to adjust the paths and configurations as needed
+# Example command to run the script with 4 GPUs
+# Note: Adjust the paths to your data and model as necessary
+# Example command to run the script with 4 GPUs
-torchrun --standalone --nnodes=1 --nproc_per_node=4 finetune_hf_sft.py data/AdvertiseGen_fix /path/to/chatglm3-6b configs/sft.yaml
+torchrun --standalone --nnodes=1 --nproc_per_node=4 finetune_hf_sft.py /home/client/nb/AdvertiseGen_fix /home/model/ZhipuAI/chatglm3-6b configs/sft.yaml
```