Commit 7555423f authored by chenych

Modify readme and codes

parent 4d4e01e5
@@ -62,18 +62,17 @@ vllm: 0.6.2+das.opt3.dtk2504
 git clone https://developer.sourcefind.cn/codes/OpenDAS/llama-factory
 ```
-2. Download a pretrained model as described in [Pretrained weights](#pretrained-weights); this example uses the [Meta-Llama-3-8B-Instruct](http://113.200.138.88:18080/aimodels/Meta-Llama-3-8B-Instruct) model.
+2. Download a pretrained model as described in [Pretrained weights](#pretrained-weights); this example uses the [Llama-4-Scout-17B-16E-Instruct](https://www.scnet.cn/ui/aihub/models/openaimodels/Llama-4-Scout-17B-16E-Instruct) model.
 #### Full-parameter fine-tuning
 For example SFT training scripts, see the corresponding YAML files under `llama-factory/train_full`.
 **Parameter changes**
---model_name_or_path: change to the path of the model to be trained, e.g. /data/Meta-llama3-models/Meta-Llama-3-8B-Instruct
---dataset: name of the fine-tuning dataset; see /LLaMA-Factory-0.6.3/data/dataset_info.json for the available datasets
---template: change default to llama3
+--model_name_or_path: change to the path of the model to be trained, e.g. `/data/Llama-4-Scout-17B-16E-Instruct`
+--dataset: name of the fine-tuning dataset; see `llama-factory/data/dataset_info.json` for the available datasets
+--template: change default to `llama4`
 --output_dir: directory in which the model is saved
---fp16 or --bf16 enables mixed precision; --pure_bf16 can be used for pure bf16 training
 Other parameters such as --learning_rate and --save_steps can be adjusted to suit your hardware and needs.
 #### LoRA fine-tuning
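The flag changes described above map onto LLaMA-Factory's YAML config files. A minimal full-parameter SFT sketch might look like the following; the key names mirror the flags discussed above, but every value here is an illustrative assumption, not a setting taken from this repository:

```yaml
# Hypothetical full-parameter SFT config sketch in LLaMA-Factory's YAML style.
# All values are examples only; adjust to your hardware, data, and paths.
model_name_or_path: /data/Llama-4-Scout-17B-16E-Instruct   # path of the downloaded model
stage: sft
do_train: true
finetuning_type: full
dataset: identity            # any name listed in llama-factory/data/dataset_info.json
template: llama4
output_dir: saves/llama4-full-sft
learning_rate: 1.0e-5
save_steps: 500
```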
@@ -133,37 +132,15 @@ python infer_transformers.py --model_id /path_of/model_id
 Manufacturing, broadcast media, home furnishing, education
 ## Pretrained weights
-The model directory structure is as follows:
-```bash
-├── model_save_path
-│   ├── Meta-Llama-3-8B
-│   │   ├── original
-│   │   │   ├── consolidated.00.pth
-│   │   │   ├── params.json
-│   │   │   └── tokenizer.model
-│   │   ├── config.json
-│   │   ├── configuration.json
-│   │   ├── generation_config.json
-│   │   ├── LICENSE
-│   │   ├── model-00001-of-00004.safetensors
-│   │   ├── model-00002-of-00004.safetensors
-│   │   ├── model-00003-of-00004.safetensors
-│   │   ├── model-00004-of-00004.safetensors
-│   │   ├── model.safetensors.index.json
-│   │   ├── README.md
-│   │   ├── special_tokens_map.json
-│   │   ├── tokenizer_config.json
-│   │   ├── tokenizer.json
-│   │   └── USE_POLICY.md
-```
+Download pretrained models from the [SCNet AI Community model hub](https://www.scnet.cn/ui/aihub/models):
+- [Llama-4-Scout-17B-16E](https://www.scnet.cn/ui/aihub/models/openaimodels/Llama-4-Scout-17B-16E)
+- [Llama-4-Scout-17B-16E-Instruct](https://www.scnet.cn/ui/aihub/models/openaimodels/Llama-4-Scout-17B-16E-Instruct)
+- [Llama-4-Maverick-17B-128E](https://www.scnet.cn/ui/aihub/models/openaimodels/Llama-4-Maverick-17B-128E)
+- [Llama-4-Maverick-17B-128E-Instruct](https://www.scnet.cn/ui/aihub/models/openaimodels/Llama-4-Maverick-17B-128E-Instruct)
 ## Source repository and issue feedback
 - https://developer.hpccube.com/codes/modelzoo/llama4_pytorch
 ## References
-- https://github.com/meta-llama/llama3
-- https://github.com/InternLM/xtuner
-- https://github.com/meta-llama/llama-recipes
-- https://github.com/hiyouga/LLaMA-Factory/tree/v0.6.3
+- https://github.com/meta-llama/llama-models/tree/main/models/llama4
+- https://github.com/hiyouga/LLaMA-Factory/
@@ -22,9 +22,8 @@ if __name__ == "__main__":
         device_map="auto",
         torch_dtype=torch.bfloat16,
     )
-    url1 = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/0052a70beed5bf71b92610a43a52df6d286cd5f3/diffusers/rabbit.jpg"
-    url2 = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/cat_style_layout.png"
+    url1 = "datasets/rabbit.jpg"
+    url2 = "datasets/cat_style_layout.png"
     messages = [
         {
             "role": "user",
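The hunk above switches the demo images from remote URLs to local paths. The `messages` structure it feeds into the processor can be sketched model-free as follows, assuming the standard transformers multimodal chat-message format; the text prompt is an illustrative assumption, not taken from the repository:

```python
# Minimal sketch of the multimodal chat payload used by infer_transformers.py,
# assuming the standard transformers chat-template message format with
# image/text content parts. The text prompt is an illustrative assumption.
url1 = "datasets/rabbit.jpg"
url2 = "datasets/cat_style_layout.png"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": url1},
            {"type": "image", "url": url2},
            {"type": "text", "text": "Describe how these two images are similar and how they differ."},
        ],
    },
]

# A processor would then consume this along the lines of:
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True, return_tensors="pt"
# )
```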