Unverified commit ea5da52e, authored by Nicholas Broad and committed by GitHub

add values for neftune (#32399)

I always forget what the typical values are, and I have to look at the paper every time. This will be a helpful reminder.
parent 3d7c2f9d
@@ -770,7 +770,7 @@ class TrainingArguments:
 If not `None`, this will activate NEFTune noise embeddings. This can drastically improve model performance
 for instruction fine-tuning. Check out the [original paper](https://arxiv.org/abs/2310.05914) and the
 [original code](https://github.com/neelsjain/NEFTune). Support transformers `PreTrainedModel` and also
-`PeftModel` from peft.
+`PeftModel` from peft. The original paper used values in the range [5.0, 15.0].
 optim_target_modules (`Union[str, List[str]]`, *optional*):
 The target modules to optimize, i.e. the module names that you would like to train, right now this is used only for GaLore algorithm
 https://arxiv.org/abs/2403.03507
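
For a sense of what values in the range [5.0, 15.0] mean in practice, here is a minimal pure-Python sketch of the NEFTune noise rule as described in the paper: uniform noise in [-1, 1] scaled by alpha / sqrt(seq_len * hidden_dim) is added to the token embeddings during training. The function names `neftune_noise_scale` and `add_neftune_noise` are hypothetical and used only for illustration; this is not the library's implementation, which operates on tensors inside the embedding layer.

```python
import math
import random


def neftune_noise_scale(alpha: float, seq_len: int, hidden_dim: int) -> float:
    """Half-width of the uniform noise: alpha / sqrt(seq_len * hidden_dim)."""
    return alpha / math.sqrt(seq_len * hidden_dim)


def add_neftune_noise(embeddings, alpha, rng):
    """Return a noisy copy of a [seq_len][hidden_dim] list of floats.

    Illustrative only: each entry gets independent uniform noise in
    [-scale, scale], where scale depends on alpha and the matrix size.
    """
    seq_len = len(embeddings)
    hidden_dim = len(embeddings[0])
    scale = neftune_noise_scale(alpha, seq_len, hidden_dim)
    return [[x + rng.uniform(-scale, scale) for x in row] for row in embeddings]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    toy = [[0.0] * 8 for _ in range(4)]  # toy 4-token, 8-dim embedding matrix
    for alpha in (5.0, 10.0, 15.0):
        noisy = add_neftune_noise(toy, alpha, rng)
        max_abs = max(abs(x) for row in noisy for x in row)
        bound = neftune_noise_scale(alpha, 4, 8)
        print(f"alpha={alpha}: bound={bound:.4f}, max |noise|={max_abs:.4f}")
```

Because the scale shrinks with sequence length and hidden size, the same alpha yields much smaller per-entry noise on real models than on this toy matrix. In `TrainingArguments`, the value is supplied through the `neftune_noise_alpha` argument that this docstring documents.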