ModelZoo / LLaMA-Factory-Llama3.2_pytorch · Commits

Commit 12d5cbac ("v1.0"), authored Oct 21, 2024 by chenzk
Pipeline #1780: canceled with stages · Changes: 259 · Pipelines: 1
Showing 20 changed files with 776 additions and 0 deletions (+776, -0)
examples/README_zh.md                               +240  -0
examples/accelerate/fsdp_config.yaml                 +25  -0
examples/deepspeed/ds_z0_config.json                 +29  -0
examples/deepspeed/ds_z2_config.json                 +29  -0
examples/deepspeed/ds_z2_offload_config.json         +33  -0
examples/deepspeed/ds_z3_config.json                 +31  -0
examples/deepspeed/ds_z3_offload_config.json         +39  -0
examples/extras/adam_mini/qwen2_full_sft.yaml        +39  -0
examples/extras/badam/llama3_full_sft.yaml           +42  -0
examples/extras/fsdp_qlora/llama3_lora_sft.yaml      +40  -0
examples/extras/fsdp_qlora/train.sh                   +6  -0
examples/extras/galore/llama3_full_sft.yaml          +43  -0
examples/extras/llama_pro/expand.sh                   +6  -0
examples/extras/llama_pro/llama3_freeze_sft.yaml     +41  -0
examples/extras/loraplus/llama3_lora_sft.yaml        +40  -0
examples/extras/mod/llama3_full_sft.yaml             +40  -0
examples/extras/pissa/init.sh                         +5  -0
examples/extras/pissa/llama3_lora_sft.yaml           +42  -0
examples/inference/llama3.yaml                        +2  -0
examples/inference/llama3_lora_sft.yaml               +4  -0

examples/README_zh.md (new file, mode 100644)

We provide a diverse set of example scripts for fine-tuning large models. Please make sure to run the commands below from the `LLaMA-Factory` directory.

## Table of Contents

- [LoRA Fine-Tuning](#lora-fine-tuning)
- [QLoRA Fine-Tuning](#qlora-fine-tuning)
- [Full-Parameter Fine-Tuning](#full-parameter-fine-tuning)
- [Merging LoRA Adapters and Model Quantization](#merging-lora-adapters-and-model-quantization)
- [Inferring LoRA Fine-Tuned Models](#inferring-lora-fine-tuned-models)
- [Extras](#extras)

Use `CUDA_VISIBLE_DEVICES` (GPU) or `ASCEND_RT_VISIBLE_DEVICES` (NPU) to choose the computing devices.
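
For example, to run the LoRA SFT recipe on the first two GPUs only (the device indices are illustrative):

```bash
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```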
## Examples

### LoRA Fine-Tuning

#### (Continuous) Pre-Training

```bash
llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
```

#### Supervised Fine-Tuning

```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```

#### Multimodal Supervised Fine-Tuning

```bash
llamafactory-cli train examples/train_lora/llava1_5_lora_sft.yaml
llamafactory-cli train examples/train_lora/qwen2vl_lora_sft.yaml
```

#### DPO/ORPO/SimPO Training

```bash
llamafactory-cli train examples/train_lora/llama3_lora_dpo.yaml
```

#### Multimodal DPO/ORPO/SimPO Training

```bash
llamafactory-cli train examples/train_lora/qwen2vl_lora_dpo.yaml
```

#### Reward Model Training

```bash
llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml
```

#### PPO Training

```bash
llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml
```

#### KTO Training

```bash
llamafactory-cli train examples/train_lora/llama3_lora_kto.yaml
```

#### Preprocessing the Dataset

This is helpful for large datasets: set `tokenized_path` in the config to load the preprocessed dataset (see the sketch after the command below).

```bash
llamafactory-cli train examples/train_lora/llama3_preprocess.yaml
```
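
A minimal sketch of how `tokenized_path` ties the two runs together; the path is illustrative, and my reading of the option is that a missing path is written on the preprocessing run and then loaded on later runs:

```yaml
# add to both the preprocessing config and the later training config (illustrative path)
tokenized_path: saves/llama3-8b/tokenized/alpaca_demo
# first run: the tokenized dataset is saved to this directory
# later runs: the saved tokenized dataset is loaded instead of being re-tokenized
```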
#### Evaluating on the MMLU/CMMLU/C-Eval Benchmarks

```bash
llamafactory-cli eval examples/train_lora/llama3_lora_eval.yaml
```

#### Batch Prediction and Computing BLEU and ROUGE Scores

```bash
llamafactory-cli train examples/train_lora/llama3_lora_predict.yaml
```

#### Supervised Fine-Tuning on Multiple Nodes

```bash
FORCE_TORCHRUN=1 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
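
The launch variables follow the usual torchrun conventions; roughly (this gloss is an interpretation, not part of the original README):

```bash
FORCE_TORCHRUN=1         # force launching through torchrun rather than a plain Python process
NNODES=2                 # total number of participating machines
RANK=0                   # this machine's node rank (0 on the first node, 1 on the second)
MASTER_ADDR=192.168.0.1  # address of the rank-0 node, reachable from every node
MASTER_PORT=29500        # rendezvous port on the rank-0 node
```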
#### Distributing GPU Memory Evenly with DeepSpeed ZeRO-3

```bash
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/llama3_lora_sft_ds3.yaml
```

### QLoRA Fine-Tuning

#### Supervised Fine-Tuning with 4/8-bit Bitsandbytes/HQQ/EETQ Quantization (Recommended)

```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_otfq.yaml
```

#### Supervised Fine-Tuning with 4/8-bit GPTQ Quantization

```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_gptq.yaml
```

#### Supervised Fine-Tuning with 4-bit AWQ Quantization

```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_awq.yaml
```

#### Supervised Fine-Tuning with 2-bit AQLM Quantization

```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_aqlm.yaml
```

### Full-Parameter Fine-Tuning

#### Supervised Fine-Tuning on a Single Node

```bash
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/llama3_full_sft_ds3.yaml
```

#### Supervised Fine-Tuning on Multiple Nodes

```bash
FORCE_TORCHRUN=1 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft_ds3.yaml
FORCE_TORCHRUN=1 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft_ds3.yaml
```

#### Multimodal Supervised Fine-Tuning

```bash
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2vl_full_sft.yaml
```

#### Batch Prediction and Computing BLEU and ROUGE Scores

```bash
llamafactory-cli train examples/train_full/llama3_full_predict.yaml
```

### Merging LoRA Adapters and Model Quantization

#### Merging LoRA Adapters

Note: DO NOT use a quantized model or the `quantization_bit` argument when merging LoRA adapters.

```bash
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
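
The referenced merge config is not among the files shown in this commit; below is a minimal sketch of what such a config typically contains, where `export_dir` and both paths are assumptions:

```yaml
### model: un-quantized base model plus the trained LoRA adapter
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
finetuning_type: lora

### export: where to write the merged weights (key name assumed)
export_dir: models/llama3_lora_sft
```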
#### Quantizing a Model with AutoGPTQ

```bash
llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
```

### Inferring LoRA Fine-Tuned Models

#### Using the Command-Line Interface

```bash
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```

#### Using the Browser Interface

```bash
llamafactory-cli webchat examples/inference/llama3_lora_sft.yaml
```

#### Launching an OpenAI-Style API

```bash
llamafactory-cli api examples/inference/llama3_lora_sft.yaml
```
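
The server exposes an OpenAI-compatible endpoint, so any OpenAI-style client can talk to it. A curl sketch, where the host, port, and model name are assumptions to adapt to your deployment:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Hello!"}]}'
```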
### Extras

#### Full-Parameter Fine-Tuning with GaLore

```bash
llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
```

#### Full-Parameter Fine-Tuning with BAdam

```bash
llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
```

#### Full-Parameter Fine-Tuning with Adam-mini

```bash
llamafactory-cli train examples/extras/adam_mini/qwen2_full_sft.yaml
```

#### LoRA+ Fine-Tuning

```bash
llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
```

#### PiSSA Fine-Tuning

```bash
llamafactory-cli train examples/extras/pissa/llama3_lora_sft.yaml
```

#### Mixture-of-Depths Fine-Tuning

```bash
llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
```

#### LLaMA-Pro Fine-Tuning

```bash
bash examples/extras/llama_pro/expand.sh
llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
```

#### FSDP+QLoRA Fine-Tuning

```bash
bash examples/extras/fsdp_qlora/train.sh
```

examples/accelerate/fsdp_config.yaml (new file, mode 100644)

compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: BACKWARD_PRE
  fsdp_forward_prefetch: false
  fsdp_cpu_ram_efficient_loading: true
  fsdp_offload_params: true # offload may affect training speed
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: fp16 # or bf16
num_machines: 1 # the number of nodes
num_processes: 2 # the number of GPUs in all nodes
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
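
This Accelerate config is consumed via `accelerate launch --config_file ...`; the FSDP+QLoRA script later in this commit (examples/extras/fsdp_qlora/train.sh) uses it exactly this way. Adjust `num_processes` and `num_machines` to your hardware:

```bash
accelerate launch \
    --config_file examples/accelerate/fsdp_config.yaml \
    src/train.py examples/extras/fsdp_qlora/llama3_lora_sft.yaml
```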

examples/deepspeed/ds_z0_config.json (new file, mode 100644)

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 0,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}

examples/deepspeed/ds_z2_config.json (new file, mode 100644)

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}

examples/deepspeed/ds_z2_offload_config.json (new file, mode 100644)

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}

examples/deepspeed/ds_z3_config.json (new file, mode 100644)

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}

examples/deepspeed/ds_z3_offload_config.json (new file, mode 100644)

{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
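
These DeepSpeed JSON files are wired into training through the `deepspeed` key of a training YAML (the BAdam example below carries the same key as a comment); a one-line sketch pointing at the config above:

```yaml
deepspeed: examples/deepspeed/ds_z3_offload_config.json
```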

examples/extras/adam_mini/qwen2_full_sft.yaml (new file, mode 100644)

### model
model_name_or_path: Qwen/Qwen2-1.5B-Instruct

### method
stage: sft
do_train: true
finetuning_type: full
use_adam_mini: true

### dataset
dataset: identity,alpaca_en_demo
template: qwen
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/qwen2-1_5b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/badam/llama3_full_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: full
use_badam: true
badam_mode: layer
badam_switch_mode: ascending
badam_switch_interval: 50
badam_verbose: 2
# deepspeed: examples/deepspeed/ds_z3_config.json

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/fsdp_qlora/llama3_lora_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
quantization_bit: 4

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/fsdp_qlora/train.sh (new file, mode 100644)

#!/bin/bash
# DO NOT use GPTQ/AWQ model in FSDP+QLoRA

CUDA_VISIBLE_DEVICES=0,1 accelerate launch \
    --config_file examples/accelerate/fsdp_config.yaml \
    src/train.py examples/extras/fsdp_qlora/llama3_lora_sft.yaml

examples/extras/galore/llama3_full_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: full
use_galore: true
galore_layerwise: true
galore_target: mlp,self_attn
galore_rank: 128
galore_scale: 2.0

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/llama_pro/expand.sh (new file, mode 100644)

#!/bin/bash

python scripts/llama_pro.py \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --output_dir models/llama3-8b-pro \
    --num_expand 8

examples/extras/llama_pro/llama3_freeze_sft.yaml (new file, mode 100644)

### model
model_name_or_path: models/llama3-8b-pro

### method
stage: sft
do_train: true
finetuning_type: freeze
freeze_trainable_layers: 8
freeze_trainable_modules: all
use_llama_pro: true

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b-pro/freeze/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/loraplus/llama3_lora_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
loraplus_lr_ratio: 16.0

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/mod/llama3_full_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: full
mixture_of_depths: convert

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b-mod/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
optim: paged_adamw_8bit
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/extras/pissa/init.sh (new file, mode 100644)

#!/bin/bash

python scripts/pissa_init.py \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --output_dir models/llama3-8b-pissa

examples/extras/pissa/llama3_lora_sft.yaml (new file, mode 100644)

### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
pissa_init: true
pissa_iter: 16
pissa_convert: true

### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3-8b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

examples/inference/llama3.yaml (new file, mode 100644)

model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3

examples/inference/llama3_lora_sft.yaml (new file, mode 100644)

model_name_or_path: meta-llama/Llama-3.2-3B-Instruct
adapter_name_or_path: saves/Llama-3.2-3B/lora/sft
template: llama3
finetuning_type: lora
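
This is the inference config that the README commands above point at, for example:

```bash
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```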