# for MiniCPM-2B hf inference
torch>=2.0.0
transformers==4.53.2
gradio>=4.26.0
# for vllm inference
# vllm>=0.4.0.post1
# for openai api inference
openai>=1.17.1
tiktoken>=0.6.0
loguru>=0.7.2
sentence_transformers>=2.6.1
sse_starlette>=2.1.0
# for MiniCPM-V hf inference
Pillow>=10.3.0
timm>=0.9.16
sentencepiece>=0.2.0
# MiniCPM Fine-tuning
(Some of the function documentation in this code was generated automatically by [RepoAgent](https://github.com/OpenBMB/RepoAgent).)
[English Version](https://github.com/OpenBMB/MiniCPM/blob/main/finetune/README_en.md)
This directory provides fine-tuning examples for the MiniCPM-2B model, covering both full-parameter fine-tuning and PEFT. In terms of format, we provide examples for multi-turn dialogue fine-tuning and for input-output format fine-tuning.
If you have downloaded the model locally, replace every occurrence of the `OpenBMB/MiniCPM-2B` field in this document and in the code with the corresponding local path so the model is loaded from disk.
Running the examples requires `python>=3.10`; besides the basic `torch` dependency, install the remaining requirements as follows.
**We provide an [example notebook](lora_finetune.ipynb) that demonstrates, with AdvertiseGen as the example, how to process the data and use the fine-tuning scripts.**
```bash
pip install -r requirements.txt
```
## Hardware Requirements for Testing
We only provide single-node multi-GPU / multi-node multi-GPU run examples, so you need at least one machine with multiple GPUs. With the **default configuration files** in this repository, we observed the following memory usage:
+ Full-parameter SFT: evenly distributed across 4 GPUs, each using `30245MiB` of memory.
+ LoRA fine-tuning: 1 GPU, using `10619MiB` of memory.
+ QLoRA fine-tuning with CPU offload: 1 GPU, using `5500MiB` of memory.
> Note that these numbers are for reference only; memory usage may differ with different parameters. Please adjust for your own hardware.
## Multi-Turn Dialogue Format
The multi-turn dialogue fine-tuning example follows the ChatGLM3 dialogue format convention, applying a different `loss_mask` to each role so that the `loss` for all assistant replies in a conversation is computed in a single pass.
For the data file, the example uses the following format:
```json
[
{
"messages": [
{
"role": "system",
"content": "<system prompt text>"
},
{
"role": "user",
"content": "<user prompt text>"
},
{
"role": "assistant",
"content": "<assistant response text>"
},
// ... Multi Turn
{
"role": "user",
"content": "<user prompt text>"
},
{
"role": "assistant",
"content": "<assistant response text>"
}
]
}
// ...
]
```
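The sketch below is a simplified version of the masking logic in this directory's fine-tuning script (assuming a Hugging Face tokenizer with `apply_chat_template`); it shows how labels are built so that only assistant tokens contribute to the loss, with `-100` being the label value that PyTorch's `CrossEntropyLoss` ignores.
```python
# Simplified per-role loss masking; the full logic (including model-version
# special cases) lives in this directory's fine-tuning script.
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss

def build_labels(tokenizer, messages):
    input_ids, label_ids = [tokenizer.bos_token_id], [IGNORE_INDEX]
    for msg in messages:
        ids = tokenizer.apply_chat_template([msg])
        input_ids += ids
        if msg["role"] == "assistant":
            label_ids += ids  # supervised: these tokens receive loss
        else:
            label_ids += [IGNORE_INDEX] * len(ids)  # masked: no loss
    input_ids.append(tokenizer.eos_token_id)
    label_ids.append(tokenizer.eos_token_id)
    return input_ids, label_ids
```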
## Dataset Format Example
> Note that the fine-tuning code now includes a validation set, so a complete fine-tuning dataset must contain a training set and a validation set; the test set is optional, or the validation set can be reused in its place.
```json
{"messages": [{"role": "user", "content": "类型#裙*裙长#半身裙"}, {"role": "assistant", "content": "这款百搭时尚的仙女半身裙,整体设计非常的飘逸随性,穿上之后每个女孩子都能瞬间变成小仙女啦。料子非常的轻盈,透气性也很好,穿到夏天也很舒适。"}]}
```
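Since both a training file and a validation file are required, one simple option is to split a single file of records. Below is a minimal sketch under the assumption that your records live in one JSON array file (paths are illustrative); note that the `SupervisedDataset` in this directory's fine-tuning script reads each file with `json.load`, so the sketch writes JSON arrays:
```python
import json
import random

# Split one JSON array of {"messages": [...]} records into train/dev files.
with open("data/all.json", encoding="utf-8") as f:
    records = json.load(f)
random.seed(0)
random.shuffle(records)
n_dev = max(1, len(records) // 10)  # hold out roughly 10% as the validation set
splits = {"train.json": records[n_dev:], "dev.json": records[:n_dev]}
for name, part in splits.items():
    with open(f"data/AdvertiseGenChatML/{name}", "w", encoding="utf-8") as out:
        json.dump(part, out, ensure_ascii=False, indent=2)
```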
## Start Fine-tuning
Run **single-node multi-GPU / multi-node multi-GPU** training with the following commands.
```bash
cd finetune
bash sft_finetune.sh
```
Run **single-node single-GPU** training with the following commands.
```bash
cd finetune
bash lora_finetune.sh
```
# MiniCPM Fine-tuning
(Some of the documentation in this demo code was generated automatically by [RepoAgent](https://github.com/OpenBMB/RepoAgent).)
[Chinese Version](https://github.com/OpenBMB/MiniCPM/blob/main/finetune/README.md)
This directory provides examples of fine-tuning the MiniCPM-2B model, including full model fine-tuning and PEFT. In terms of format, we offer examples for multi-turn dialogue fine-tuning and input-output format fine-tuning.
If you have downloaded the model to your local system, the `OpenBMB/MiniCPM-2B` field mentioned in this document and in the code should be replaced with the corresponding address to load the model from your local system.
Running the examples requires `python>=3.10`; besides the basic `torch` dependency, install the remaining requirements as follows.
**We have provided an [example notebook](lora_finetune.ipynb) to demonstrate how to process data and use the fine-tuning script with AdvertiseGen as an example.**
```bash
pip install -r requirements.txt
```
## Hardware Requirements for Testing
We only provide examples for single-node multi-GPU / multi-node multi-GPU setups, so you will need at least one machine with multiple GPUs. With the **default configuration files** in this repository, we observed the following memory usage:
+ Full-parameter SFT: evenly distributed across 4 GPUs, each consuming `30245MiB` of memory.
+ LoRA fine-tuning: one GPU, consuming `10619MiB` of memory.
+ QLoRA fine-tuning with CPU offload: one GPU, consuming `5500MiB` of memory.
> Please note that these results are for reference only, and memory consumption may vary with different parameters. Please adjust according to your hardware situation.
## Multi-Turn Dialogue Format
The multi-turn dialogue fine-tuning example adopts the ChatGLM3 dialogue format convention, applying a different `loss_mask` to each role so that the `loss` for all assistant replies in a conversation is computed in a single pass.
For the data file, the example uses the following format:
```json
[
{
"messages": [
{
"role": "system",
"content": "<system prompt text>"
},
{
"role": "user",
"content": "<user prompt text>"
},
{
"role": "assistant",
"content": "<assistant response text>"
},
// ... Multi Turn
{
"role": "user",
"content": "<user prompt text>"
},
{
"role": "assistant",
"content": "<assistant response text>"
}
]
}
// ...
]
```
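To sanity-check the masking on a real record, the minimal sketch below (assuming the fine-tuning script in this directory is named `finetune.py` and that you run from the `finetune` directory) loads one example through `SupervisedDataset` and decodes only the tokens that receive loss; the output should contain just the assistant replies:
```python
from transformers import AutoTokenizer

from finetune import SupervisedDataset  # the dataset class defined in this directory

tokenizer = AutoTokenizer.from_pretrained(
    "openbmb/MiniCPM-2B-sft-bf16", trust_remote_code=True
)
ds = SupervisedDataset("data/AdvertiseGenChatML/dev.json", tokenizer, model_max_length=512)
item = ds[0]
supervised = [t for t in item["label_ids"].tolist() if t != -100]
print(tokenizer.decode(supervised))  # expect only the assistant response text
```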
## Dataset Format Example
> Please note that the fine-tuning code now includes a validation set, so a complete fine-tuning dataset must contain a training set and a validation set; the test set is optional, or the validation set can be reused in its place.
```json
{"messages": [{"role": "user", "content": "类型#裙*裙长#半身裙"}, {"role": "assistant", "content": "这款百搭时尚的仙女半身裙,整体设计非常的飘逸随性,穿上之后每个女孩子都能瞬间变成小仙女啦。料子非常的轻盈,透气性也很好,穿到夏天也很舒适。"}]}
```
## Start Fine-tuning
Execute **single-node multi-GPU/multi-node multi-GPU** runs with the following commands.
```bash
cd finetune
bash sft_finetune.sh
```
Execute **single-node single-GPU** runs with the following commands.
```bash
cd finetune
bash lora_finetune.sh
```
{
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"bf16": {
"enabled": "auto"
},
"zero_optimization": {
"stage": 2,
"allgather_partitions": true,
"overlap_comm": true,
"reduce_scatter": true,
"contiguous_gradients": true
},
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"gradient_accumulation_steps": "auto",
"gradient_clipping": 1.0,
"wall_clock_breakdown": false,
"flops_profiler": {
"enabled": false,
"profile_step": 1,
"module_depth": -1,
"top_modules": 1,
"detailed": true,
"output_file": null
}
}
{
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"bf16": {
"enabled": "auto"
},
"zero_optimization": {
"stage": 2,
"allgather_partitions": true,
"overlap_comm": true,
"reduce_scatter": true,
"contiguous_gradients": true,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
}
},
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"gradient_accumulation_steps": "auto",
"gradient_clipping": 1.0,
"wall_clock_breakdown": false,
"flops_profiler": {
"enabled": false,
"profile_step": 1,
"module_depth": -1,
"top_modules": 1,
"detailed": true,
"output_file": null
}
}
{
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"bf16": {
"enabled": "auto"
},
"zero_optimization": {
"stage": 3,
"allgather_partitions": true,
"allgather_bucket_size": 5e8,
"reduce_scatter": true,
"contiguous_gradients": true,
"overlap_comm": true,
"reduce_bucket_size": "auto",
"stage3_prefetch_bucket_size": "auto",
"stage3_param_persistence_threshold": "auto",
"stage3_gather_16bit_weights_on_model_save": true
},
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"gradient_accumulation_steps": "auto",
"gradient_clipping": 1.0,
"wall_clock_breakdown": false,
"flops_profiler": {
"enabled": false,
"profile_step": 1,
"module_depth": -1,
"top_modules": 1,
"detailed": true,
"output_file": null
}
}
{
"fp16": {
"enabled": "auto",
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"bf16": {
"enabled": "auto"
},
"zero_optimization": {
"stage": 3,
"allgather_partitions": true,
"allgather_bucket_size": 5e8,
"reduce_scatter": true,
"contiguous_gradients": true,
"overlap_comm": true,
"reduce_bucket_size": "auto",
"stage3_prefetch_bucket_size": "auto",
"stage3_param_persistence_threshold": "auto",
"stage3_gather_16bit_weights_on_model_save": true,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
},
"offload_param": {
"device": "cpu",
"pin_memory": true
}
},
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"gradient_accumulation_steps": "auto",
"gradient_clipping": 1.0,
"wall_clock_breakdown": false,
"flops_profiler": {
"enabled": false,
"profile_step": 1,
"module_depth": -1,
"top_modules": 1,
"detailed": true,
"output_file": null
}
}
[
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*材质#牛仔布*颜色#白色*风格#简约*图案#刺绣*衣样式#外套*衣款式#破洞"
},
{
"role": "assistant",
"content": "简约而不简单的牛仔外套,白色的衣身十分百搭。衣身多处有做旧破洞设计,打破单调乏味,增加一丝造型看点。衣身后背处有趣味刺绣装饰,丰富层次感,彰显别样时尚。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#裙*材质#针织*颜色#纯色*风格#复古*风格#文艺*风格#简约*图案#格子*图案#纯色*图案#复古*裙型#背带裙*裙长#连衣裙*裙领型#半高领"
},
{
"role": "assistant",
"content": "这款BRAND针织两件套连衣裙,简约的纯色半高领针织上衣,修饰着颈部线,尽显优雅气质。同时搭配叠穿起一条背带式的复古格纹裙,整体散发着一股怀旧的时髦魅力,很是文艺范。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*风格#嘻哈*图案#卡通*图案#印花*图案#撞色*衣样式#卫衣*衣款式#连帽"
},
{
"role": "assistant",
"content": "嘻哈玩转童年,随时<UNK>,没错,出街还是要靠卫衣来装酷哦!时尚个性的连帽设计,率性有范还防风保暖。还有胸前撞色的卡通印花设计,靓丽抢眼更富有趣味性,加上前幅大容量又时尚美观的袋鼠兜,简直就是孩子耍帅装酷必备的利器。"
}
]
}
]
[
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#裤*版型#宽松*风格#性感*图案#线条*裤型#阔腿裤"
},
{
"role": "assistant",
"content": "宽松的阔腿裤这两年真的吸粉不少,明星时尚达人的心头爱。毕竟好穿时尚,谁都能穿出腿长2米的效果宽松的裤腿,当然是遮肉小能手啊。上身随性自然不拘束,面料亲肤舒适贴身体验感棒棒哒。系带部分增加设计看点,还让单品的设计感更强。腿部线条若隐若现的,性感撩人。颜色敲温柔的,与裤子本身所呈现的风格有点反差萌。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#裙*风格#简约*图案#条纹*图案#线条*图案#撞色*裙型#鱼尾裙*裙袖长#无袖"
},
{
"role": "assistant",
"content": "圆形领口修饰脖颈线条,适合各种脸型,耐看有气质。无袖设计,尤显清凉,简约横条纹装饰,使得整身人鱼造型更为生动立体。加之撞色的鱼尾下摆,深邃富有诗意。收腰包臀,修饰女性身体曲线,结合别出心裁的鱼尾裙摆设计,勾勒出自然流畅的身体轮廓,展现了婀娜多姿的迷人姿态。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*版型#宽松*颜色#粉红色*图案#字母*图案#文字*图案#线条*衣样式#卫衣*衣款式#不规则"
},
{
"role": "assistant",
"content": "宽松的卫衣版型包裹着整个身材,宽大的衣身与身材形成鲜明的对比描绘出纤瘦的身形。下摆与袖口的不规则剪裁设计,彰显出时尚前卫的形态。被剪裁过的样式呈现出布条状自然地垂坠下来,别具有一番设计感。线条分明的字母样式有着花式的外观,棱角分明加上具有少女元气的枣红色十分有年轻活力感。粉红色的衣身把肌肤衬托得很白嫩又健康。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#裙*版型#宽松*材质#雪纺*风格#清新*裙型#a字*裙长#连衣裙"
},
{
"role": "assistant",
"content": "踩着轻盈的步伐享受在午后的和煦风中,让放松与惬意感为你免去一身的压力与束缚,仿佛要将灵魂也寄托在随风摇曳的雪纺连衣裙上,吐露出<UNK>微妙而又浪漫的清新之意。宽松的a字版型除了能够带来足够的空间,也能以上窄下宽的方式强化立体层次,携带出自然优雅的曼妙体验。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*材质#棉*颜色#蓝色*风格#潮*衣样式#polo*衣领型#polo领*衣袖长#短袖*衣款式#拼接"
},
{
"role": "assistant",
"content": "想要在人群中脱颖而出吗?那么最适合您的莫过于这款polo衫短袖,采用了经典的polo领口和柔软纯棉面料,让您紧跟时尚潮流。再配合上潮流的蓝色拼接设计,使您的风格更加出众。就算单从选料上来说,这款polo衫的颜色沉稳经典,是这个季度十分受大众喜爱的风格了,而且兼具舒适感和时尚感。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*版型#h*材质#蚕丝*风格#复古*图案#条纹*图案#复古*图案#撞色*衣样式#衬衫*衣领型#小立领"
},
{
"role": "assistant",
"content": "小女人十足的条纹衬衣,缎面一点点的复古,还有蓝绿色这种高级气质复古色,真丝材质,撞色竖条纹特别的现代感味道,直h型的裁剪和特别的衣长款式,更加独立性格。双层小立领,更显脸型。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#裙*材质#网纱*颜色#粉红色*图案#线条*图案#刺绣*裙腰型#高腰*裙长#连衣裙*裙袖长#短袖*裙领型#圆领"
},
{
"role": "assistant",
"content": "这款连衣裙,由上到下都透出一丝迷人诱惑的女性魅力,经典圆领型,开口度恰好,露出你的迷人修长的脖颈线条,很是优雅气质,短袖设计,在这款上竟是撩人美貌,高腰线,散开的裙摆,到小腿的长度,遮住了腿部粗的部分,对身材有很好的修饰作用,穿起来很女神;裙身粉红色花枝重工刺绣,让人一眼难忘!而且在这种网纱面料上做繁复图案的绣花,是很考验工艺的,对机器的要求会更高,更加凸显我们的高品质做工;"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*颜色#纯色*图案#纯色*图案#文字*图案#印花*衣样式#卫衣"
},
{
"role": "assistant",
"content": "一款非常简洁大方的纯色卫衣,设计点在于胸前的“<UNK><UNK>”的中文字印花,新颖特别,让人眼前一亮。简单又吸睛的款式,而且不失时髦感,很适合个性年轻人。"
}
]
},
{
"messages": [
{
"role": "system",
"content": "你是一个广告生成助手,你的任务是根据用户提供的产品信息和广告类型,生成相应的广告文案。"
},
{
"role": "user",
"content": "类型#上衣*版型#宽松*颜色#黑色*颜色#灰色*颜色#姜黄色*风格#休闲*图案#线条*图案#撞色*衣样式#毛衣*衣袖型#落肩袖"
},
{
"role": "assistant",
"content": "看惯了灰色的冷淡和黑色的沉闷感,来一点醒目的彩色增添点活力吧。亮眼又吸睛的姜黄色色调,嫩肤显白非常的有设计感。趣味的撞色和宽松的版型相交辉映,修饰身形小缺点的同时,时尚又百搭。优雅的落肩袖,轻松修饰肩部线条,让毛衣上身凸显出一丝慵懒随性的休闲感,时尚魅力尽显。"
}
]
}
]
[
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 身上裹一件工厂发的棉大衣,手插在袖筒里\n句子2:身上至少一件衣服\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 东、中、西部地区要加强多种形式的合作,在协调发展中逐步实现共同富裕\n句子2:东、中、西部地区发展存在不协调\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 外贸经营权进一步放开\n句子2:外贸经营权经历了先收缩再放开的过程。\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 一些地方财政收支矛盾较大\n句子2:地方经历了经济危机\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 等他回来,我们就出去吃啊.\n句子2:我们在等他\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 我觉得你看吧,咱们从专业角度,咱们刚刚说了,物理学咱们都没有办法去判断他到底怎么排名\n句子2:他学习物理学\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 否则,我们的高要求得不到落实,也影响了我们的低目标的实现\n句子2:我们的要求落实与否无所谓。\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 因此,日本舆论曾盛传海部内阁是过渡性政权\n句子2:此舆论在中国传播范围同样广泛\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 阿嗲说,谢谢你们,生日还送了东西.\n句子2:阿嗲收到生日礼物\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 他当时每次喝酒喝大,有一次喝酒喝得太大我就把他拽到我们家,因为我知道他一定最后以吐出来为终点\n句子2:我去过他家\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 我我后来又给你写了一封信啊.\n句子2:其实收信人收到了第一封信,但是可能忘记了\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 每天都在打高尔夫球的.\n句子2:主人公生活得很规律。\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 要狠抓造林质量不放松\n句子2:种的树越多越好,至于成活率和质量并不需要考虑\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
}
]
[
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 一月份跟二月份肯定有一个月份有.\n句子2:肯定有一个月份有\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 一月份跟二月份肯定有一个月份有.\n句子2:一月份有\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 一月份跟二月份肯定有一个月份有.\n句子2:一月二月都没有\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 一点来钟时,张永红却来了\n句子2:一点多钟,张永红来了\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 不讲社会效果,信口开河,对任何事情都随意发议论,甚至信谣传谣,以讹传讹,那是会涣散队伍、贻误事业的\n句子2:以讹传讹是有害的\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 不讲社会效果,信口开河,对任何事情都随意发议论,甚至信谣传谣,以讹传讹,那是会涣散队伍、贻误事业的\n句子2:以讹传讹会被处罚\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 不讲社会效果,信口开河,对任何事情都随意发议论,甚至信谣传谣,以讹传讹,那是会涣散队伍、贻误事业的\n句子2:信口开河不会贻误事业\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 严师母又哼了一声:你保证你没有别的心,却不能保证旁人没有\n句子2:你保证过你没有别的心\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 严师母又哼了一声:你保证你没有别的心,却不能保证旁人没有\n句子2:旁人有别的心\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 严师母又哼了一声:你保证你没有别的心,却不能保证旁人没有\n句子2:你一定能够保证旁人没有别的心\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 中国人民勤劳智慧,具有无限的创新创造潜能,只要充分释放出来,中国的发展就一定会有更为广阔空间\n句子2:中国人民的创造潜能完全没有被释放出来\n"
},
{
"role": "assistant",
"content": "neutral"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 中国人民勤劳智慧,具有无限的创新创造潜能,只要充分释放出来,中国的发展就一定会有更为广阔空间\n句子2:中国人民没有创造潜能\n"
},
{
"role": "assistant",
"content": "contradiction"
}
]
},
{
"messages": [
{
"role": "user",
"content": "请判断下边两个句子的关系属于 [entailment, neutral, contradiction]中的哪一种?\n句子1: 事实表明,美国侵犯别国国权威性,遑论侵犯人权了\n句子2:美国侵犯了别国国权威性\n"
},
{
"role": "assistant",
"content": "entailment"
}
]
}
]
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 准备数据集\n",
"\n",
"将数据集转换为更通用的格式\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# 转换为 ChatML 格式\n",
"import os\n",
"import shutil\n",
"import json\n",
"\n",
"input_dir = \"data/AdvertiseGen\"\n",
"output_dir = \"data/mlx_AdvertiseGen\"\n",
"if os.path.exists(output_dir):\n",
" shutil.rmtree(output_dir)\n",
"os.makedirs(output_dir, exist_ok=True)\n",
"\n",
"for fn in [\"train.json\", \"dev.json\"]:\n",
" data_out_list = []\n",
" with open(os.path.join(input_dir, fn), \"r\") as f, open(os.path.join(output_dir, fn), \"w\") as fo:\n",
" for line in f:\n",
" if len(line.strip()) > 0:\n",
" data = json.loads(line)\n",
" data_out = {\"input\":data['content'],'prompt':\"/n请为以下关键词生成一条广告语。\",'output':data['summary']}\n",
" data_out_list.append(data_out)\n",
"\n",
" for d in data_out_list:\n",
" json_str = json.dumps(d,ensure_ascii=False) # 将字典转换为JSON字符串\n",
" fo.write(json_str + '\\n') # 写入字符串并添加换行符\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
# -*- coding: utf-8 -*-
import json
from dataclasses import dataclass, field
from typing import Dict, Optional
import torch
import transformers
from torch.utils.data import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    BitsAndBytesConfig,
)
@dataclass
class ModelArguments:
model_name_or_path: Optional[str] = field(default="openbmb/MiniCPM-2B-sft-bf16")
@dataclass
class DataArguments:
train_data_path: str = field(
default="data/AdvertiseGenChatML/train.json",
metadata={"help": "Path to the training data."},
)
eval_data_path: str = field(
default="data/AdvertiseGenChatML/dev.json",
metadata={"help": "Path to the test data."},
)
@dataclass
class TrainingArguments(transformers.TrainingArguments):
cache_dir: Optional[str] = field(default=None)
optim: str = field(default="adamw_torch")
model_max_length: int = field(
default=512,
metadata={
"help": "Maximum sequence length. Sequences will be right padded (and possibly truncated)."
},
)
use_lora: bool = field(default=False)
qlora: bool = field(default=False)
class SupervisedDataset(Dataset):
    """Dataset for supervised fine-tuning.

    Loads the dataset from a JSON file and preprocesses it. Example:
    data/AdvertiseGenChatML/train.json
    """
def __init__(
self,
data_path,
tokenizer,
model_max_length=4096,
):
super(SupervisedDataset, self).__init__()
self.data = json.load(open(data_path))
self.tokenizer = tokenizer
self.model_max_length = model_max_length
self.ignore_index = -100
item = self.preprocessing(self.data[0])
print("input:", self.tokenizer.decode(item["input_ids"]))
labels = []
for id_ in item["label_ids"]:
if id_ == -100:
continue
labels.append(id_)
print("label:", self.tokenizer.decode(labels))
def __len__(self):
return len(self.data)
def preprocessing(self, example):
input_ids = [self.tokenizer.bos_token_id]
label_ids = [self.ignore_index]
for message in example["messages"]:
role = message["role"]
content = message["content"]
content_ids = self.tokenizer.apply_chat_template([message])
if role == "user":
if self.tokenizer.eos_token_id == 73440: # minicpm3.0 and minicpm4.0
input_ids += self.tokenizer.apply_chat_template(
[message], add_generation_prompt=True
)
label_ids += [self.ignore_index] * len(
self.tokenizer.apply_chat_template(
[message], add_generation_prompt=True
)
)
else: # minicpm2.0
input_ids += content_ids
label_ids += [self.ignore_index] * len(content_ids)
elif role == "system":
input_ids += content_ids
label_ids += [self.ignore_index] * len(content_ids)
elif role == "assistant":
if self.tokenizer.eos_token_id == 73440: # minicpm3.0 and minicpm4.0
input_ids += tokenizer.encode(content, add_special_tokens=False)
label_ids += tokenizer.encode(content, add_special_tokens=False)
else: # minicpm2.0
input_ids += content_ids
label_ids += content_ids
input_ids.append(self.tokenizer.eos_token_id)
label_ids.append(self.tokenizer.eos_token_id)
# truncate to max len
input_ids = input_ids[: self.model_max_length]
label_ids = label_ids[: self.model_max_length]
attention_mask = [1] * len(input_ids)
# pad to max len
input_ids += [self.tokenizer.eos_token_id] * (
self.model_max_length - len(input_ids)
)
label_ids += [self.ignore_index] * (self.model_max_length - len(label_ids))
attention_mask += [0] * (self.model_max_length - len(attention_mask))
# convert to pt tensor
input_ids = torch.LongTensor(input_ids)
label_ids = torch.LongTensor(label_ids)
attention_mask = torch.LongTensor(attention_mask)
return {
"input_ids": input_ids,
"label_ids": label_ids,
"attention_mask": attention_mask,
}
def __getitem__(self, idx) -> Dict[str, torch.Tensor]:
return self.preprocessing(self.data[idx])
def load_model_and_tokenizer(
model_path: str,
max_length: int = 4096,
use_lora: bool = True,
qlora: bool = False,
bf16: bool = False,
fp16: bool = False,
):
"""load model and tokenizer"""
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
assert not (bf16 and fp16), "bf16 or fp16, not both"
if bf16:
dtype = torch.bfloat16
elif fp16:
dtype = torch.float16
else:
dtype = torch.float32
if qlora:
assert use_lora, "use_lora must be True when use_qlora is True"
        quantization_config = BitsAndBytesConfig(
            load_in_4bit=True,  # quantize the weights to 4 bits
            load_in_8bit=False,  # do not use 8-bit quantization
            bnb_4bit_compute_dtype=torch.float16,  # compute dtype used for matmuls
            bnb_4bit_quant_storage=torch.uint8,  # storage dtype for the quantized weights
            bnb_4bit_quant_type="nf4",  # 4-bit NormalFloat (normal-distribution int4) quantization
            bnb_4bit_use_double_quant=True,  # double quantization: also quantize the zero-point and scaling parameters
            llm_int8_enable_fp32_cpu_offload=False,  # keep CPU-resident parameters in fp32 while the LLM uses int8
            llm_int8_has_fp16_weight=False,  # whether to enable mixed-precision int8 weights
            # llm_int8_skip_modules=["out_proj", "kv_proj", "lm_head"],  # modules to skip when quantizing
            llm_int8_threshold=6.0,  # outlier threshold for llm.int8(); values above it stay unquantized
        )
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=dtype,
trust_remote_code=True,
quantization_config=quantization_config,
)
else:
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=dtype,
trust_remote_code=True,
)
if use_lora:
from peft import LoraConfig, TaskType, get_peft_model
lora_config = LoraConfig(
init_lora_weights="gaussian",
task_type=TaskType.CAUSAL_LM,
target_modules=(
["q_a_proj", "kv_a_proj_with_mqa", "q_b_proj", "kv_b_proj"]
if model.config.architectures == ["MiniCPM3ForCausalLM"]
else ["q_proj", "v_proj"]
),
r=64,
lora_alpha=32,
lora_dropout=0.1,
inference_mode=False,
)
model = get_peft_model(model, lora_config)
# trainable params: 2,949,120 || all params: 3,010,652,928 || trainable%: 0.09795616002669305
model.print_trainable_parameters()
# model.enable_input_require_grads() # need when using adapter
return model, tokenizer
if __name__ == "__main__":
model_path = "/mnt/data/user/tc_agi/yh/models/MiniCPM"
parser = transformers.HfArgumentParser(
(ModelArguments, DataArguments, TrainingArguments)
)
model_args, data_args, training_args = parser.parse_args_into_dataclasses()
model, tokenizer = load_model_and_tokenizer(
model_path=model_args.model_name_or_path,
max_length=training_args.model_max_length,
use_lora=training_args.use_lora,
qlora=training_args.qlora,
bf16=training_args.bf16,
fp16=training_args.fp16,
)
train_dataset = SupervisedDataset(
data_path=data_args.train_data_path,
tokenizer=tokenizer,
model_max_length=training_args.model_max_length,
)
eval_dataset = SupervisedDataset(
data_path=data_args.eval_data_path,
tokenizer=tokenizer,
model_max_length=training_args.model_max_length,
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=tokenizer,
)
trainer.train()
# save the incremental PEFT weights, more details can be found in https://huggingface.co/blog/peft
# model.save_pretrained("output_dir")
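# A minimal sketch (assumption: training ran with use_lora=True): the commented
# call above writes only the LoRA adapter; to obtain a standalone model, the
# adapter can be merged back into the base weights with PEFT, e.g.:
# from peft import AutoPeftModelForCausalLM
# merged = AutoPeftModelForCausalLM.from_pretrained(
#     "output_dir", trust_remote_code=True
# ).merge_and_unload()  # fold the LoRA deltas into the base weights
# merged.save_pretrained("output_dir_merged")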
# MiniCPM Fine-tuning with LLaMA-Factory
MiniCPM is supported by LLaMA-Factory, which provides continued pre-training, SFT, PPO, DPO, KTO, ORPO, and other fine-tuning methods.
Since LLaMA-Factory is powerful but can be hard for beginners to pick up, we have recorded a fine-tuning tutorial.
**We provide the llama_factory_example folder for fine-tuning the MiniCPM-1B and MiniCPM-2B models.**
1. First, install the LLaMA-Factory dependencies.
```bash
git clone https://github.com/hiyouga/LLaMA-Factory
cd LLaMA-Factory
pip install -r requirements.txt
```
2. Process your dataset into the format shown in the MiniCPM/finetune/llama_factory_example/llama_factory_data folder (the examples cover the dpo, kto, and sft fine-tuning methods) and place it in the LLaMA-Factory/data directory. Taking dpo as an example:
```json
[
{
"conversations": [
{
"from": "human",
"value": "Hi! I'd like to create a new language game simulating the first person perspective of a character named Angela."
}
],
"chosen": {
"from": "gpt",
"value": "That sounds like a fun and engaging idea! Here are some tips to help you create the game:\n1. ......"
},
"rejected": {
"from": "gpt",
"value": "Hello! I'd be happy to help you create a language game simulating the first-person perspective ....."
}
}
]
```
3. Add your dataset's information to LLaMA-Factory/data/dataset_info.json so that your dataset can be found there, as in the following example:
```json
{
"identity": {
"file_name": "identity.json"
},
"sft_zh_demo": {
"file_name": "alpaca_zh_demo.json"
},
"kto_en_demo": {
"file_name": "kto_en_demo.json",
"formatting": "sharegpt",
"columns": {
"messages": "messages",
"kto_tag": "label"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant"
}
},
"dpo_en_demo": {
"file_name": "dpo_en_demo.json",
"ranking": true,
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"chosen": "chosen",
"rejected": "rejected"
}
}
}
```
4. Copy the files from MiniCPM/finetune/llama_factory_example into the LLaMA-Factory/examples directory.
```bash
cd LLaMA-Factory/examples
mkdir minicpm
# Replace /your/path in the commands below with your MiniCPM and LLaMA-Factory paths
cp -r /your/path/MiniCPM/finetune/llama_factory_example/* /your/path/LLaMA-Factory/examples/minicpm
```
5. Taking dpo as an example, first modify minicpm_dpo.yaml; the fields to change are:
```yaml
model_name_or_path: openbmb/MiniCPM-2B-sft-bf16 # or your local save path
dataset: dpo_en_demo # use the key name from dataset_info.json
output_dir: your/finetune_minicpm/save/path
bf16: true # set to true if your device supports bf16, otherwise false
deepspeed: examples/deepspeed/ds_z2_config.json # if GPU memory is insufficient, switch to ds_z3_config.json
```
6. In the single_node.sh file:
- 1. If you are using A100 or higher-end servers, delete the following two lines:
```bash
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
```
- 2. Set the GPUs that should participate in fine-tuning; the following example uses all eight cards (GPUs 0 through 7):
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
```
- 3. Change the argument after src/train.py in the line below to the absolute path of minicpm_dpo.yaml inside LLaMA-Factory:
```bash
src/train.py /root/ld/ld_project/LLaMA-Factory/examples/minicpm/minicpm_sft.yaml
```
7. Run:
```bash
cd LLaMA-Factory
bash single_node.sh
```
### model
model_name_or_path: /root/ld/ld_model_pretrained/MiniCPM4/
### method
stage: dpo
do_train: true
finetuning_type: full
### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json
### dataset
dataset: dpo_en_demo
template: cpm3
cutoff_len: 1200
max_samples: 50000000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/minicpm/dpo
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_strategy: epoch
### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.00001
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
### eval
val_size: 0.1
per_device_eval_batch_size: 4
evaluation_strategy: steps
eval_steps: 500
### model
model_name_or_path: /root/ld/ld_model_pretrained/MiniCPM4/
### method
stage: kto
do_train: true
finetuning_type: full
kto_ftx: 0.1
### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json
### dataset
dataset: kto_harmless
template: cpm3
cutoff_len: 1200
max_samples: 500000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/minicpm/kto
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 0.000005
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
### eval
val_size: 0.1
per_device_eval_batch_size: 16
evaluation_strategy: steps
eval_steps: 500
### model
model_name_or_path: /root/ld/ld_model_pretrained/MiniCPM4/
### method
stage: sft
do_train: true
finetuning_type: full
### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json
### dataset
dataset: glaive_toolcall_en,glaive_toolcall_zh
template: cpm3
cutoff_len: 1800
max_samples: 500000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/minicpm/function_call
logging_steps: 10
save_strategy: epoch
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.0001
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
### eval
val_size: 0.1
per_device_eval_batch_size: 4
evaluation_strategy: steps
eval_steps: 500