Commit ef0d50b5 authored by zhaoying1

update for multinode imp

parent 555d0cba
......@@ -87,6 +87,10 @@ site-packages/transformers/utils/versions.py file
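Before training, refer to [modeling_baichuan.py](./modeling_baichuan.py) and modify the `Attention` class in the modeling_baichuan.py found in your model folder. The main change is to (temporarily) remove the dependency on torch 2.x.
### Note 3
If xformers is not supported, multi-node training may fail with an xformers-related error: "ImportError: This modeling file requires the following packages that were not found in your environment: xformers." You can work around this by setting `xops` to None in [modeling_baichuan.py](./modeling_baichuan.py), that is, comment out the code that imports xformers and set `xops=None`.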
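The workaround in Note 3 can be sketched roughly as follows. This is a minimal illustration rather than the actual contents of modeling_baichuan.py: it assumes the file imports xformers under the name `xops`, and the helper `use_xformers_attention` is hypothetical, added only to show how downstream code could check for the missing package.

```python
# Sketch of the edit described in Note 3 (assumed layout, not the real file).

# from xformers import ops as xops   # commented out: xformers is not available
xops = None  # signals that the xformers attention path must not be used


def use_xformers_attention(training: bool) -> bool:
    """Hypothetical helper: use the xformers path only when xops was imported."""
    return xops is not None and training


if __name__ == "__main__":
    # With xops set to None, the model should always take the fallback path.
    print(use_xformers_attention(training=True))
```

With `xops` fixed to `None`, any branch guarded this way falls back to the non-xformers attention implementation, so the import check no longer needs the package to be installed.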
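## Dataset
The input data is a json file placed in the project's [fine-tune/data](./fine-tune/data) directory: `fine-tune/data/belle_chat_ramdon_10k.json`. This sample consists of 10,000 examples drawn from [multiturn_chat_0.8M](https://huggingface.co/datasets/BelleGroup/multiturn_chat_0.8M) and converted to the required format. It is mainly intended to show how to train on multi-turn data, and the quality of the results is not guaranteed. An example of the json file format is shown below: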
......@@ -151,7 +155,7 @@ bash run_ft.sh
1. Single-node training
```
cd fine-tune
bash run_lora.sh
bash lora_train.sh
```
......
hostfile=""
HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed --hostfile=$hostfile fine-tune.py \
--report_to "none" \
--data_path "data/test.json" \
--data_path "data/belle_chat_ramdon_10k.json" \
--model_name_or_path "../../baichuan2-13b-chat-hf" \
--output_dir "output" \
--model_max_length 64 \
......