"vscode:/vscode.git/clone" did not exist on "240abddfbce5e006997f2b53c1ff248e17b3b580"
Commit 78ba9d16 authored by Rayyyyy

Update GLM-4-0414

parent 7fa8c0b3
```bash
python gen_messages_data.py --data_path /path/to/AdvertiseGen
```
- The `tools` field is optional. If a `tools` field is present, it must appear after the `system` role, and a complete conversation (whether single-turn or multi-turn) may contain the `tools` field only once. When the `tools` field is present, the `system` role must exist and its `content` field must be empty; see the sketch after this list.
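For illustration, a minimal sketch of a single record that satisfies these rules, assuming the `tools` list is attached to the `system` message (verify the exact layout against the output of `gen_messages_data.py`; all field values here are illustrative):

```json
{
  "messages": [
    {
      "role": "system",
      "content": "",
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "realtime_aqi",
            "description": "Real-time air quality query",
            "parameters": {
              "type": "object",
              "properties": {"city": {"description": "City name"}},
              "required": ["city"]
            }
          }
        }
      ]
    },
    {"role": "user", "content": "北京今天的天气情况"},
    {"role": "assistant", "content": "北京当前 AQI 为 10。"}
  ]
}
```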
## Training
Download a pretrained model from [Pretrained Weights](#pretrained-weights); the current example uses the [GLM-4-9B-chat](https://huggingface.co/THUDM/glm-4-9b-chat) or [GLM-4-9B-0414](https://huggingface.co/THUDM/GLM-4-9B-0414) model.
### GLM-4-9B-chat native training method
1. Enter the `finetune_demo` directory and first install the required dependencies:
```bash
cd finetune_demo
pip install -r requirements.txt
```
+ `../checkpoints/glm-4-9b-chat/`: model path
+ `configs/lora.yaml`: configuration file path
#### Single-node single-GPU
```shell
bash train.sh
```
#### Single-node multi-GPU / multi-node multi-GPU
`deepspeed` is used here as the acceleration scheme; make sure the `deepspeed` library has already been installed in the current environment as described in the [environment setup section](#环境配置).
```shell
bash train_dp.sh
```
#### Fine-tuning from a checkpoint
If you train as described above, each fine-tuning run starts from scratch. To resume fine-tuning from a partially trained model, add a fourth argument, which can be passed in two ways:
1. `yes`: automatically resume from the **last saved checkpoint**, for example:
```shell
python finetune.py ../data/AdvertiseGen/saves/ ../checkpoints/glm-4-9b-chat/ configs/lora.yaml yes
```
2. A number: training resumes from the checkpoint saved at that step, for example:
```shell
python finetune.py ../data/AdvertiseGen/saves/ ../checkpoints/glm-4-9b-chat/ configs/lora.yaml 600
```
### Llama Factory fine-tuning method (recommended)
Install the training library (**outside the glm-4_pytorch directory**); install a `Llama-Factory` version **newer than v0.9.2**. For the detailed installation procedure, see that repository's README.
```bash
git clone https://developer.sourcefind.cn/codes/OpenDAS/llama-factory
```
#### Full-parameter fine-tuning
For an example SFT training script, see the corresponding yaml file under `llama-factory/train_full`.
**Parameters to modify**
- **--model_name_or_path**: change to the path of the model to be trained, e.g. `/data/GLM-4-9B-0414`
- **--dataset**: name of the fine-tuning dataset; for the available datasets, see `llama-factory/data/dataset_info.json`
- **--template**: change default to `glm4`
- **--output_dir**: directory where the model is saved
Other parameters, such as `--learning_rate` and `--save_steps`, can be adjusted to fit your hardware and needs.
#### LoRA fine-tuning
For an example SFT training script, see the corresponding yaml file under `llama-factory/train_lora`.
Parameter descriptions are the same as for [full-parameter fine-tuning](#full-parameter-fine-tuning); a launch sketch for both variants follows.
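A minimal launch sketch using the standard LLaMA-Factory CLI (the yaml file names below are illustrative; use the actual files shipped under `llama-factory/train_full` and `llama-factory/train_lora`, reproduced at the end of this page):

```bash
# full-parameter SFT
llamafactory-cli train train_full/glm4_full_sft.yaml
# LoRA SFT
llamafactory-cli train train_lora/glm4_lora_sft.yaml
```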
## Inference
### GLM-4-9B-Chat/GLM-4V-9B model inference script
**Parameter description**
- `--model_name_or_path`: name or path of the model to test; defaults to "THUDM/glm-4-9b-chat"
- `--device`: defaults to "cuda"
- `--query`: input query to test; defaults to "你好"
```bash
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com/
cd basic_demo
python quick_start.py
```
#### Chatting with the GLM-4-9B model on the command line
```bash
# chat
python trans_cli_demo.py --model_name_or_path ../checkpoints/glm-4-9b-chat
# multimodal
python trans_cli_vision_demo.py --model_name_or_path ../checkpoints/glm-4v-9b
```
#### Chatting with the GLM-4-9B-Chat model via the Gradio web UI
```bash
python trans_web_demo.py --model_name_or_path ../checkpoints/glm-4-9b-chat
```
#### Validating the fine-tuned model
You can use the fine-tuned model via [finetune_demo/inference.py](./finetune_demo/inference.py), executed as follows.
```shell
python inference.py your_finetune_path
```
### GLM-4-9B-0414/GLM-4-32B-0414/GLM-4-32B-Base-0414 model inference script
```bash
python infer_glm4.py --model_path /path/of/model/ --message "你好"
```
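With the default question, this script exercises the model's function-calling flow end to end; the printed trace looks roughly like the following (illustrative, not captured from a real run; the mock `realtime_aqi` tool is defined in the script source at the end of this page, and the final answer depends on the model):

```
User Message: 北京和上海今天的天气情况
Function Call: {'name': 'realtime_aqi', 'arguments': {'city': '北京'}}
Function Response: {"city": "北京", "aqi": "10", "unit": "celsius"}
Assistant Response: ...
```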
## Results
- GLM-4-9B-Chat inference results
<div align=center>
<img src="./doc/glm4_9b_result.png" width=1500 heigh=400/>
</div>
- GLM-4-9B-0414 inference results
<div align=center>
<img src="./doc/result.png" width=1500 heigh=400/>
<img src="./doc/glm4_9b_0414_result.png" width=1500 heigh=400/>
</div>
### Accuracy
Dataset: AdvertiseGen
Model: GLM-4-9B-Chat
| device | iters | train_loss |
| :------: | :------: | :------: |
### Popular application industries
Home furnishing, education, scientific research
## Pretrained Weights
- [GLM-4-9B](https://huggingface.co/THUDM/glm-4-9b)
- [GLM-4-9B-chat](https://huggingface.co/THUDM/glm-4-9b-chat)
- [GLM-4-9B-0414](https://huggingface.co/THUDM/GLM-4-9B-0414)
- [GLM-4-32B-0414](https://huggingface.co/THUDM/GLM-4-32B-0414)
- [GLM-4-32B-Base-0414](https://huggingface.co/THUDM/GLM-4-32B-Base-0414)
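If direct access to the Hugging Face hub is slow, a minimal download sketch using the mirror endpoint configured earlier (the local target directory is illustrative):

```bash
pip install -U "huggingface_hub[cli]"
export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download THUDM/GLM-4-9B-0414 --local-dir ../checkpoints/GLM-4-9B-0414
```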
## Source repository and issue feedback
- https://developer.sourcefind.cn/codes/modelzoo/glm-4_pytorch
## References
- https://github.com/THUDM/GLM-4
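The `infer_glm4.py` script added by this commit follows; it implements the function-calling loop used above (generate, detect tool calls, run the mock tool, feed the observation back):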
import json
import re
import ast
import argparse

from transformers import AutoModelForCausalLM, AutoTokenizer


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_path', type=str, default="THUDM/GLM-4-9B-0414", help='Model path.')
    parser.add_argument('--message', default="北京和上海今天的天气情况", help='Question to ask.')
    args = parser.parse_args()
    return args


def is_function_call(single_message):
    """Parse a function call from a model message; return {'name', 'arguments'} or False."""
    pattern = re.compile(r'([^\n`]*?)\n({.*?})(?=\w*\n|$)', re.DOTALL)
    matches = pattern.findall(single_message)
    if not matches:
        return False
    func_name, args_str = matches[0]
    func_name = func_name.strip()
    try:
        parsed_args = json.loads(args_str)
    except json.JSONDecodeError:
        try:
            parsed_args = ast.literal_eval(args_str)
        except (ValueError, SyntaxError):
            return False
    return {"name": func_name, "arguments": parsed_args}


def realtime_aqi(city):
    """Mock weather/air-quality query tool that returns canned data."""
    if '北京' in city.lower():
        return json.dumps({'city': '北京', 'aqi': '10', 'unit': 'celsius'}, ensure_ascii=False)
    elif '上海' in city.lower():
        return json.dumps({'city': '上海', 'aqi': '72', 'unit': 'fahrenheit'}, ensure_ascii=False)
    else:
        return json.dumps({'city': city, 'aqi': 'unknown'}, ensure_ascii=False)


def build_system_prompt(tools):
    """Construct the system prompt based on the list of available tools."""
    if tools is None:
        tools = []
    value = "# 可用工具"
    contents = []
    for tool in tools:
        content = f"\n\n## {tool['function']['name']}\n\n{json.dumps(tool['function'], ensure_ascii=False, indent=4)}"
        content += "\n在调用上述函数时,请使用 Json 格式表示调用的参数。"
        contents.append(content)
    value += "".join(contents)
    return value


if __name__ == "__main__":
    args = parse_args()
    tokenizer = AutoTokenizer.from_pretrained(args.model_path)
    model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto")

    tools = [
        {
            "type": "function",
            "function": {
                "name": "realtime_aqi",
                "description": "天气预报。获取实时空气质量。当前空气质量,PM2.5,PM10信息",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "description": "城市名"
                        }
                    },
                    "required": [
                        "city"
                    ]
                }
            }
        }
    ]

    system_prompt = build_system_prompt(tools)
    message = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": args.message}
    ]
    print(f"User Message: {message[-1]['content']}")

    # Agent loop: generate, execute any requested tool calls, append observations, repeat
    while True:
        inputs = tokenizer.apply_chat_template(
            message,
            return_tensors="pt",
            add_generation_prompt=True,
            return_dict=True,
        ).to(model.device)
        generate_kwargs = {
            "input_ids": inputs["input_ids"],
            "attention_mask": inputs["attention_mask"],
            "max_new_tokens": 1024,
            "do_sample": True,
        }
        out = model.generate(**generate_kwargs)
        generate_resp = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:-1], skip_special_tokens=False)
        stop_sequence = tokenizer.decode(out[0][-1:], skip_special_tokens=False)
        if stop_sequence == "<|user|>":
            print(f"Assistant Response: {generate_resp.strip()}")
            break
        function_calls = []
        for m in generate_resp.split("<|assistant|>"):
            fc_decode = is_function_call(m.strip())
            if fc_decode:
                message.append({"role": "assistant", "metadata": fc_decode['name'], "content": json.dumps(fc_decode['arguments'], ensure_ascii=False)})
                print(f"Function Call: {fc_decode}")
                function_calls.append(fc_decode)
            else:
                message.append({"role": "assistant", "content": m})
                print(f"Assistant Response: {m.strip()}")
        for fc in function_calls:
            function_response = realtime_aqi(
                city=fc["arguments"]["city"],
            )
            print(f"Function Response: {function_response}")
            message.append({"role": "observation", "content": function_response})
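Next, the commit's LLaMA-Factory yaml for full-parameter SFT (the `llama-factory/train_full` config referenced in the fine-tuning section above):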
### model
model_name_or_path: THUDM/GLM-4-9B-0414
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]
### dataset
dataset: identity,alpaca_en_demo
template: glm4
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/glm-4-9b/full/sft
logging_steps: 1
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null
### eval
# eval_dataset: alpaca_en_demo
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500
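And the corresponding LoRA variant (the `llama-factory/train_lora` config):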
### model
model_name_or_path: THUDM/GLM-4-9B-0414
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
deepspeed: examples/deepspeed/ds_z3_config.json # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]
### dataset
dataset: identity,alpaca_en_demo
template: glm4
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/glm-4-9b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null
### eval
# eval_dataset: alpaca_en_demo
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500
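Finally, the updated model metadata file: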
# Unique model identifier
modelCode=684
# Model name
modelName=GLM-4_pytorch
# Model description
modelDescription=The GLM-4 series is the open-source edition of Zhipu AI's latest generation of pretrained models. In dataset evaluations covering semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and its human-preference-aligned version GLM-4-9B-Chat both show performance surpassing Llama-3-8B. The GLM-4-32B-0414 series, at 32 billion parameters, is on par with OpenAI's GPT series and DeepSeek's V3/R1 series, and supports very friendly local deployment.
# Application scenarios
appScenario=Inference,Training,Multi-turn dialogue,Home furnishing,Education,Scientific research
# Framework type