Commit 0a0618ac authored by Rayyyyy

Add some explanation

parent 9aa9b60f
......@@ -62,6 +62,7 @@ pip install -e .
### Fine-tuning with xtuner
1. Install the training libraries; note the required library versions
```bash
pip uninstall flash-attn # 2.0.4+82379d7.abi0.dtk2404.torch2.1
pip install deepspeed-0.12.3+das1.0+gita724046.abi0.dtk2404.torch2.1.0-cp310-cp310-manylinux2014_x86_64.whl
pip install -U xtuner # 0.1.18
pip install mmengine==0.10.3
......@@ -74,7 +75,7 @@ python download_models.py
```
2. In [llama3_8b_instruct_qlora_alpaca_e3_M.py](./llama3_8b_instruct_qlora_alpaca_e3_M.py), set `pretrained_model_name_or_path` and `data_path` to the local model and dataset paths;
3. Adjust `max_length`, `batch_size`, `accumulative_counts`, `max_epochs`, `lr`, `save_steps`, `evaluation_freq`, and the `r` and `lora_alpha` parameters of model.lora according to your hardware and training needs; the defaults fit 4*32G;
4. Set the ${DCU_NUM} parameter to the number of DCU cards to use; for other datasets, modify the `SYSTEM`, `evaluation_inputs`, `dataset_map_fn`, `train_dataloader.sampler`, and `train_cfg` settings in llama3_8b_instruct_qlora_alpaca_e3_M.py (see the comments in the code for details; the alpaca dataset is the default), and **set the model save path with `--work-dir`**
5. Run
```bash
bash finetune.sh
......
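The hyperparameters in step 3 interact: the global batch size seen by the optimizer is the per-card `batch_size` times `accumulative_counts` times the number of DCU cards, which is why all three are tuned together. A minimal sketch of that relationship (the function name and example values are illustrative, not taken from the repo):

```python
def effective_batch_size(batch_size: int, accumulative_counts: int, dcu_num: int) -> int:
    """Global batch = per-card micro-batch * gradient-accumulation steps * cards."""
    return batch_size * accumulative_counts * dcu_num

# e.g. micro-batch 1, 16 accumulation steps, 4 DCU cards
print(effective_batch_size(1, 16, 4))  # -> 64
```

Halving `batch_size` to fit memory while doubling `accumulative_counts` keeps the effective batch size, and therefore the appropriate `lr`, roughly unchanged.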
......@@ -186,7 +186,7 @@ default_hooks = dict(
    # save checkpoint per `save_steps`.
    checkpoint=dict(
        type=CheckpointHook,
        by_epoch=False,  # save checkpoints by steps
        interval=save_steps,
        max_keep_ckpts=save_total_limit),
    # set sampler seed in distributed environment.
......
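With `by_epoch=False`, the hook counts iterations rather than epochs: a checkpoint is written every `save_steps` iterations, and only the newest `max_keep_ckpts` (set from `save_total_limit`) are kept on disk. A toy simulation of that retention behavior (a sketch, not mmengine's actual implementation):

```python
def saved_checkpoints(total_iters: int, save_steps: int, max_keep_ckpts: int) -> list:
    """Return the iteration numbers whose checkpoints remain on disk."""
    kept = []
    for it in range(1, total_iters + 1):
        if it % save_steps == 0:
            kept.append(it)
            if len(kept) > max_keep_ckpts:
                kept.pop(0)  # oldest checkpoint is pruned
    return kept

# 1000 iterations, save every 300, keep at most 2 checkpoints
print(saved_checkpoints(1000, 300, 2))  # -> [600, 900]
```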