Commit 78b10a15 authored by Rayyyyy

Update README and change basic_demo/inference.py to basic_demo/quick_start.py

parent ce1b4aca
@@ -62,7 +62,7 @@ pip install -r requirements.txt
### Prepare the dataset
This repository uses the [ADGEN](https://aclanthology.org/D19-1321.pdf) (advertisement generation) dataset as an example to show how the code is used. The preprocessed ADGEN dataset can be downloaded from [Google Drive](https://drive.google.com/file/d/13_vf0xRTQsyneRKdD1bZIr93vBGOczrk/view?usp=sharing) or [Tsinghua Cloud](https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1). After the download finishes, extract the data into the [data](./data) directory.
-Once the data is stored at that path, run the data conversion script below; the generated `dev.jsonl``train.jsonl` are saved to the `AdvertiseGen/saves` directory by default:
+Once the data is stored at that path, run the data conversion script below; the generated `dev.jsonl` `train.jsonl` are saved to the `AdvertiseGen/saves` directory by default:
```
python gen_messages_data.py --data_path /path/to/AdvertiseGen
```
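To make the conversion step concrete, here is a minimal sketch of turning one ADGEN record into the `messages` format. It is not the repository's `gen_messages_data.py`. It assumes each input line is a JSON object with a `content` field (the prompt) and a `summary` field (the target text); the helper names `adgen_to_messages` and `convert_lines` are hypothetical.

```python
import json


def adgen_to_messages(record):
    """Wrap one ADGEN record in a single-turn conversation.

    Assumption: the record has "content" (prompt) and "summary" (target).
    """
    return {
        "messages": [
            {"role": "user", "content": record["content"]},
            {"role": "assistant", "content": record["summary"]},
        ]
    }


def convert_lines(lines):
    """Convert an iterable of JSON lines to JSONL strings in messages format."""
    converted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # ensure_ascii=False keeps Chinese text readable in the output file
        converted.append(json.dumps(adgen_to_messages(record), ensure_ascii=False))
    return converted


if __name__ == "__main__":
    sample = ['{"content": "prompt text", "summary": "target text"}']
    for row in convert_lines(sample):
        print(row)
```

Each returned string can be written as one line of `train.jsonl` or `dev.jsonl`.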
@@ -90,11 +90,8 @@ python gen_messages_data.py --data_path /path/to/AdvertiseGen
{"messages": [{"role": "system", "content": "", "tools": [{"type": "function", "function": {"name": "get_recommended_books", "description": "Get recommended books based on user's interests", "parameters": {"type": "object", "properties": {"interests": {"type": "array", "items": {"type": "string"}, "description": "The interests to recommend books for"}}, "required": ["interests"]}}}]}, {"role": "user", "content": "Hi, I am looking for some book recommendations. I am interested in history and science fiction."}, {"role": "assistant", "content": "{\"name\": \"get_recommended_books\", \"arguments\": {\"interests\": [\"history\", \"science fiction\"]}}"}, {"role": "observation", "content": "{\"books\": [\"Sapiens: A Brief History of Humankind by Yuval Noah Harari\", \"A Brief History of Time by Stephen Hawking\", \"Dune by Frank Herbert\", \"The Martian by Andy Weir\"]}"}, {"role": "assistant", "content": "Based on your interests in history and science fiction, I would recommend the following books: \"Sapiens: A Brief History of Humankind\" by Yuval Noah Harari, \"A Brief History of Time\" by Stephen Hawking, \"Dune\" by Frank Herbert, and \"The Martian\" by Andy Weir."}]}
```
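-- The `system` role is optional, but if a `system` role is present it must appear before the `user`
-role, and a complete conversation (single-turn or multi-turn) may contain the `system` role only once.
-- The `tools` field is optional; if a `tools` field is present it must appear after the `system`
-role, and a complete conversation (single-turn or multi-turn) may contain the `tools` field only once. When the `tools` field is present, the `system`
-role must be present and its `content` field must be empty.
+- The `system` role is optional, but if a `system` role is present it must appear before the `user` role, and a complete conversation (single-turn or multi-turn) may contain the `system` role only once.
+- The `tools` field is optional; if a `tools` field is present it must appear after the `system` role, and a complete conversation (single-turn or multi-turn) may contain the `tools` field only once. When the `tools` field is present, the `system` role must be present and its `content` field must be empty.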
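The two rules above can be checked mechanically before training. The following is a minimal sketch, not code from the repository. `validate_messages` is a hypothetical helper, and it assumes (as in the sample record) that `tools` is a key on the `system` message itself.

```python
def validate_messages(messages):
    """Return a list of rule violations; an empty list means the conversation passes."""
    errors = []
    system_idx = [i for i, m in enumerate(messages) if m.get("role") == "system"]
    user_idx = [i for i, m in enumerate(messages) if m.get("role") == "user"]
    tools_idx = [i for i, m in enumerate(messages) if "tools" in m]

    # Rule 1: at most one system message, placed before the first user message.
    if len(system_idx) > 1:
        errors.append("more than one system message")
    if system_idx and user_idx and system_idx[0] > user_idx[0]:
        errors.append("system message must come before the first user message")

    # Rule 2: at most one tools field, carried by a system message with empty content.
    if len(tools_idx) > 1:
        errors.append("more than one tools field")
    for i in tools_idx:
        message = messages[i]
        if message.get("role") != "system":
            errors.append("tools must be attached to the system message")
        elif message.get("content"):
            errors.append("system content must be empty when tools is present")
    return errors


if __name__ == "__main__":
    conversation = [
        {"role": "system", "content": "", "tools": []},
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
    ]
    print(validate_messages(conversation))
```

Running this over every line of `train.jsonl` and `dev.jsonl` (parsing each line with `json.loads` and passing its `messages` list) would surface malformed records before fine-tuning starts.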
## Training
The pretrained model can be downloaded from the **Model List** section of [THUDM/GLM-4](https://github.com/THUDM/GLM-4). The default pretrained model for this example is **GLM-4-9B-Chat**
@@ -185,12 +182,14 @@ python finetune.py ../data/AdvertiseGen/saves/ ../checkpoints/glm-4-9b-chat/ con
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com/
-python inference.py
+python quick_start.py
```
### Chat with the GLM-4-9B model from the command line
-```
+```bash
+# chat
python trans_cli_demo.py --model_name_or_path ../checkpoints/GLM-4-9B-Chat
+# multimodal
python trans_cli_vision_demo.py --model_name_or_path ../checkpoints/GLM-4V-9B
```
@@ -200,7 +199,8 @@ python trans_web_demo.py --model_name_or_path ../checkpoints/GLM-4-9B-Chat
```
### Validate the fine-tuned model
-You can use the fine-tuned model in `finetune_demo/inference.py`; a single line of code is enough for a simple test.
+You can use the fine-tuned model in [finetune_demo/inference.py](./finetune_demo/inference.py); run it as follows.
```shell
python inference.py your_finetune_path
```
......