Commit 428e38f1 authored by wxj

Update qwen1.5_14b.sh

parent 9d8e86df
Pipeline #2575 passed with stage
@@ -99,6 +99,7 @@ TRAINING_ARGS=(
--hidden-dropout 0
# --no-gradient-accumulation-fusion
--swiglu
--add-qkv-bias
--lr 3.0e-5
--lr-decay-style cosine
--min-lr 3.0e-6
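
The hunk above is a slice of a bash array named `TRAINING_ARGS`. As a minimal sketch of how such an array behaves, the snippet below rebuilds only the lines visible in the hunk and prints what a launcher would receive. The rest of the array and the command that consumes it are not shown in this commit, so the final expansion here is an assumption for illustration, not the script's actual launch line.

```shell
#!/usr/bin/env bash
# Sketch only: these flags are copied from the visible hunk. The remaining
# entries of the real array and the command that uses it are not shown.
TRAINING_ARGS=(
    --hidden-dropout 0
    # --no-gradient-accumulation-fusion
    --swiglu
    --add-qkv-bias
    --lr 3.0e-5
    --lr-decay-style cosine
    --min-lr 3.0e-6
)

# A commented line inside the array literal is skipped by bash, so the
# disabled flag never reaches the program.
printf '%s\n' "${TRAINING_ARGS[@]}"

# Each flag and each value is a separate array element.
echo "count=${#TRAINING_ARGS[@]}"
```

Quoting the expansion as `"${TRAINING_ARGS[@]}"` keeps every element a separate word, which is why flags with values such as `--lr 3.0e-5` occupy two elements. With the lines above, the array holds ten elements.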