Unverified Commit b2c63c79 authored by Quentin Gallouédec, committed by GitHub

Fix `max_steps` documentation regarding the end-of-training condition (#27624)



* fix max_steps doc

* Update src/transformers/training_args.py [ci skip]
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* propagate suggested change

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
parent c651eb23
@@ -234,8 +234,8 @@ class TrainingArguments:
             the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
-            In case of using a finite iterable dataset the training may stop before reaching the set number of steps
-            when all data is exhausted
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         lr_scheduler_type (`str` or [`SchedulerType`], *optional*, defaults to `"linear"`):
             The scheduler type to use. See the documentation of [`SchedulerType`] for all possible values.
         lr_scheduler_kwargs ('dict', *optional*, defaults to {}):
@@ -2181,9 +2181,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         gradient_accumulation_steps (`int`, *optional*, defaults to 1):
             Number of updates steps to accumulate the gradients for, before performing a backward/update pass.
@@ -2588,9 +2588,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
@@ -92,6 +92,8 @@ class TFTrainingArguments(TrainingArguments):
             Total number of training epochs to perform.
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
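For reference, a minimal usage sketch (not part of this commit) of the behavior the corrected docstring describes: when `max_steps` is positive it overrides `num_train_epochs`, and a finite dataset is re-iterated until that many optimizer steps have run. The `output_dir` value and the numbers below are illustrative only.

from transformers import TrainingArguments

# Illustrative values; "out" is a placeholder output directory.
args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,             # ignored: max_steps takes precedence
    max_steps=500,                  # train for exactly 500 optimizer steps
    per_device_train_batch_size=8,
)

# With a finite dataset of, say, 1_000 examples and batch size 8, one pass
# is 125 steps, so the Trainer re-iterates the dataset (4 full passes)
# until step 500 is reached, rather than stopping when the data is exhausted.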