Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
Vary_pytorch
Commits
6e60f780
Commit
6e60f780
authored
May 27, 2024
by
wanglch
Browse files
Update README.md
parent
a62da366
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
2 additions
and
61 deletions
+2
-61
README.md
README.md
+2
-61
No files found.
README.md
View file @
6e60f780
...
@@ -78,67 +78,8 @@ pip install ninja
...
@@ -78,67 +78,8 @@ pip install ninja
-
本项目gitlab地址
[
Ucas-HaoranWei/Vary
](
https://github.com/Ucas-HaoranWei/Vary
)
-
本项目gitlab地址
[
Ucas-HaoranWei/Vary
](
https://github.com/Ucas-HaoranWei/Vary
)
## 训练
## 训练
需自己构建数据集
无
1.
For Vary-base
```
Shell
deepspeed vary/train/train_qwen_vary.py --deepspeed /vary/zero_config/zero2.json
--model_name_or_path /Qwen-7B/path/
--vision_tower /vit-large-patch14/path/
--freeze_vision_tower True
--freeze_lm_model False
--vision_select_layer -2
--use_im_start_end True
--bf16 True
--per_device_eval_batch_size 4
--gradient_accumulation_steps 1
--evaluation_strategy "no"
--save_strategy "steps"
--save_steps 5000
--save_total_limit 1
--weight_decay 0.
--warmup_ratio 0.03
--lr_scheduler_type "cosine"
--logging_steps 1 --tf32 True
--model_max_length 4096
--gradient_checkpointing True
--dataloader_num_workers 4
--report_to none
--per_device_train_batch_size 4
--num_train_epochs 1
--learning_rate 5e-5
--datasets data_name1+data_name2+data_name3
--output_dir /path/to/output/
```
2.
For Vary-tiny
```
Shell
deepspeed vary/train/train_opt.py --deepspeed /vary/zero_config/zero2.json
--model_name_or_path /opt125m/path/
--conversation_version opt
--freeze_vision_tower False
--freeze_lm_model False
--use_im_start_end True
--bf16 True
--per_device_eval_batch_size 4
--gradient_accumulation_steps 1
--evaluation_strategy "no"
--save_strategy "steps"
--save_steps 5000
--save_total_limit 1
--weight_decay 0.
--warmup_ratio 0.03
--lr_scheduler_type "cosine"
--logging_steps 1 --tf32 True
--model_max_length 4096
--gradient_checkpointing True
--dataloader_num_workers 4
--report_to none
--per_device_train_batch_size 16
--num_train_epochs 1
--learning_rate 5e-5
--datasets data_name1+data_name2+data_name3
--output_dir /path/to/output/
```
## 推理
## 推理
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment