Commit 428e38f1 authored by wxj

Update qwen1.5_14b.sh

parent 9d8e86df
Pipeline #2575 passed with stage
@@ -99,6 +99,7 @@ TRAINING_ARGS=(
 --hidden-dropout 0
 # --no-gradient-accumulation-fusion
 --swiglu
+--add-qkv-bias
 --lr 3.0e-5
 --lr-decay-style cosine
 --min-lr 3.0e-6
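A minimal sketch of how a flag added to a Bash array such as TRAINING_ARGS ends up as a separate argument to the launched command. It assumes the script expands the array as "${TRAINING_ARGS[@]}"; the launch line is not part of this diff, so that is an assumption. Only the option lines visible in the hunk are reproduced, and has_flag is a hypothetical helper used for illustration, not something from qwen1.5_14b.sh.

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the option lines visible in the hunk.
TRAINING_ARGS=(
    --hidden-dropout 0
    # Comment lines inside an array literal are skipped by Bash,
    # so the disabled flag below never becomes an element.
    # --no-gradient-accumulation-fusion
    --swiglu
    --add-qkv-bias
    --lr 3.0e-5
    --lr-decay-style cosine
    --min-lr 3.0e-6
)

# has_flag is a hypothetical helper, not part of the original script.
# Usage: has_flag FLAG ARGS...
# Returns 0 if FLAG appears verbatim among ARGS, 1 otherwise.
has_flag() {
    local wanted=$1
    shift
    local arg
    for arg in "$@"; do
        [[ $arg == "$wanted" ]] && return 0
    done
    return 1
}

# Quoted [@] expansion passes each element as its own word.
if has_flag --add-qkv-bias "${TRAINING_ARGS[@]}"; then
    echo "--add-qkv-bias is set"
fi
echo "${#TRAINING_ARGS[@]} arguments in TRAINING_ARGS"
```

Because the new flag takes no value, it adds exactly one element to the array; the commented-out --no-gradient-accumulation-fusion line contributes none.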