OpenDAS / LLaMA-Factory
Commit ca625f43 ("uodata"), authored Mar 30, 2026 by shihm
Parent: 7164651d
Showing 20 of 327 changed files, with 109 additions and 0 deletions:

examples/megatron/qwen3_moe_full.yaml (+35)
examples/merge_lora/qwen3_full_sft.yaml (+10)
examples/merge_lora/qwen3_gptq.yaml (+12)
examples/merge_lora/qwen3_lora_sft.yaml (+13)
examples/merge_lora/qwen3vl_lora_sft.yaml (+13)
examples/requirements/adam-mini.txt (+1)
examples/requirements/apollo.txt (+1)
examples/requirements/aqlm.txt (+1)
examples/requirements/badam.txt (+1)
examples/requirements/bitsandbytes.txt (+1)
examples/requirements/eetq.txt (+1)
examples/requirements/fp8-te.txt (+2)
examples/requirements/fp8.txt (+2)
examples/requirements/galore.txt (+1)
examples/requirements/gptq.txt (+2)
examples/requirements/hqq.txt (+1)
examples/requirements/liger-kernel.txt (+1)
examples/requirements/minicpm-v.txt (+8)
examples/requirements/openmind.txt (+1)
examples/requirements/sglang.txt (+2)
examples/megatron/qwen3_moe_full.yaml (new file, mode 100644):

model_name_or_path: Qwen/Qwen3-30B-A3B-Instruct-2507  # GPU memory: 8 * 78GB
do_train: true
stage: sft
finetuning_type: full  # only support full for now
dataset: alpaca_en_demo
preprocessing_num_workers: 8
cutoff_len: 4096
template: qwen3_nothink
# global batchsize = (8 // 2 // 4) * 8 = 8
output_dir: saves/mca/qwen3_moe_full
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
num_train_epochs: 2
learning_rate: 3e-6
logging_steps: 1
save_steps: 100
lr_scheduler_type: constant
bf16: true
# mcore speed up
tensor_model_parallel_size: 1
sequence_parallel: false
pipeline_model_parallel_size: 4
bias_activation_fusion: true
apply_rope_fusion: true
use_distributed_optimizer: true
overlap_param_gather: true
overlap_grad_reduce: true
moe_grouped_gemm: true
moe_token_dispatcher_type: alltoall
expert_model_parallel_size: 2
recompute_granularity: full
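The `# global batchsize = (8 // 2 // 4) * 8 = 8` comment above compresses the batch-size arithmetic into one line. A minimal sketch of that arithmetic, assuming an 8-GPU run where the divisors are `expert_model_parallel_size` (2) and `pipeline_model_parallel_size` (4) from this config (the GPU count itself is only implied by the memory comment):

```python
# Illustrative only: mirrors the arithmetic in the config comment, not
# Megatron's actual rank-assignment code.
def global_batch_size(world_size, expert_parallel, pipeline_parallel,
                      micro_batch, grad_accum):
    # GPUs left over after the model-parallel dimensions are carved out
    # act as data-parallel replicas.
    data_parallel = world_size // expert_parallel // pipeline_parallel
    return data_parallel * micro_batch * grad_accum

print(global_batch_size(8, 2, 4, 1, 8))  # (8 // 2 // 4) * 1 * 8 = 8
```

With per_device_train_batch_size 1 and gradient_accumulation_steps 8, a single data-parallel replica yields a global batch of 8 sequences per optimizer step.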
examples/merge_lora/qwen3_full_sft.yaml (new file, mode 100644):

### model
model_name_or_path: saves/qwen3-4b/full/sft
template: qwen3_nothink
trust_remote_code: true

### export
export_dir: saves/qwen3_sft_merged
export_size: 5
export_device: cpu  # choices: [cpu, auto]
export_legacy_format: false
examples/merge_lora/qwen3_gptq.yaml (new file, mode 100644):

### model
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507
template: qwen3_nothink
trust_remote_code: true

### export
export_dir: saves/qwen3_gptq
export_quantization_bit: 4
export_quantization_dataset: data/c4_demo.jsonl
export_size: 5
export_device: cpu  # choices: [cpu, auto]
export_legacy_format: false
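In the GPTQ config above, `export_size: 5` caps the size of each exported checkpoint shard in gigabytes. A rough back-of-the-envelope sketch of how shard count follows from that cap; the ~4B parameter count and ~0.5 bytes per parameter for 4-bit weights are illustrative assumptions, not values stated in the diff:

```python
import math

# Illustrative shard-count estimate; ignores quantization metadata,
# embeddings kept in higher precision, and other real-world overhead.
def num_shards(params_billion, bytes_per_param, max_shard_gb):
    total_gb = params_billion * bytes_per_param  # 1e9 params ~ 1 GB per byte/param
    return math.ceil(total_gb / max_shard_gb)

print(num_shards(4, 0.5, 5))  # ~2 GB of 4-bit weights fits in 1 shard of <=5 GB
```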
examples/merge_lora/qwen3_lora_sft.yaml (new file, mode 100644):

### Note: DO NOT use quantized model or quantization_bit when merging lora adapters
### model
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507
adapter_name_or_path: saves/qwen3-4b/lora/sft
template: qwen3_nothink
trust_remote_code: true

### export
export_dir: saves/qwen3_sft_merged
export_size: 5
export_device: cpu  # choices: [cpu, auto]
export_legacy_format: false
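The note at the top of this file forbids combining `quantization_bit` with a LoRA merge. A minimal sketch of that constraint as a config check, assuming the config is loaded into a plain dict; the check is illustrative and is not LLaMA-Factory's actual validation code:

```python
# Hypothetical guard mirroring the "DO NOT use quantized model or
# quantization_bit when merging lora adapters" note above.
def check_merge_config(cfg):
    merging = "adapter_name_or_path" in cfg
    quantized = "quantization_bit" in cfg
    if merging and quantized:
        raise ValueError("Cannot merge LoRA adapters into a quantized model.")
    return True

cfg = {
    "model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
    "adapter_name_or_path": "saves/qwen3-4b/lora/sft",
    "template": "qwen3_nothink",
    "export_dir": "saves/qwen3_sft_merged",
}
print(check_merge_config(cfg))  # True: no quantization_bit set
```

The reason for the rule: merging adds adapter weights into the base weights, which requires the base model in full (or bf16/fp16) precision; quantized weights cannot absorb the delta losslessly.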
examples/merge_lora/qwen3vl_lora_sft.yaml (new file, mode 100644):

### Note: DO NOT use quantized model or quantization_bit when merging lora adapters
### model
model_name_or_path: Qwen/Qwen3-VL-4B-Instruct
adapter_name_or_path: saves/qwen3-vl-4b/lora/sft
template: qwen3_vl_nothink
trust_remote_code: true

### export
export_dir: saves/qwen3_vl_sft_merged
export_size: 5
export_device: cpu  # choices: [cpu, auto]
export_legacy_format: false
examples/requirements/adam-mini.txt (new file, mode 100644):
adam-mini

examples/requirements/apollo.txt (new file, mode 100644):
apollo-torch

examples/requirements/aqlm.txt (new file, mode 100644):
aqlm[gpu]>=1.1.0

examples/requirements/badam.txt (new file, mode 100644):
badam>=1.2.1

examples/requirements/bitsandbytes.txt (new file, mode 100644):
bitsandbytes>=0.39.0

examples/requirements/eetq.txt (new file, mode 100644):
eetq

examples/requirements/fp8-te.txt (new file, mode 100644):
transformer_engine[pytorch]>=2.0.0
accelerate>=1.10.0

examples/requirements/fp8.txt (new file, mode 100644):
torchao>=0.8.0
accelerate>=1.10.0

examples/requirements/galore.txt (new file, mode 100644):
galore-torch

examples/requirements/gptq.txt (new file, mode 100644):
optimum>=1.24.0
gptqmodel>=2.0.0

examples/requirements/hqq.txt (new file, mode 100644):
hqq

examples/requirements/liger-kernel.txt (new file, mode 100644):
liger-kernel>=0.5.5

examples/requirements/minicpm-v.txt (new file, mode 100644):
soundfile
torchvision
torchaudio
vector_quantize_pytorch
vocos
msgpack
referencing
jsonschema_specifications

examples/requirements/openmind.txt (new file, mode 100644):
openmind

examples/requirements/sglang.txt (new file, mode 100644):
sglang[srt]>=0.4.5
transformers==4.51.1