Unverified commit b2c63c79 authored by Quentin Gallouédec, committed by GitHub

Fix `max_steps` documentation regarding the end-of-training condition (#27624)



* fix max_steps doc

* Update src/transformers/training_args.py [ci skip]
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* propagate suggested change

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
parent c651eb23
@@ -234,8 +234,8 @@ class TrainingArguments:
             the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
-            In case of using a finite iterable dataset the training may stop before reaching the set number of steps
-            when all data is exhausted
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         lr_scheduler_type (`str` or [`SchedulerType`], *optional*, defaults to `"linear"`):
             The scheduler type to use. See the documentation of [`SchedulerType`] for all possible values.
         lr_scheduler_kwargs ('dict', *optional*, defaults to {}):
@@ -2181,9 +2181,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         gradient_accumulation_steps (`int`, *optional*, defaults to 1):
             Number of updates steps to accumulate the gradients for, before performing a backward/update pass.
@@ -2588,9 +2588,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
@@ -92,6 +92,8 @@ class TFTrainingArguments(TrainingArguments):
             Total number of training epochs to perform.
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
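The corrected docstring says that with a positive `max_steps`, a finite dataset is re-iterated once exhausted until `max_steps` is reached, rather than training stopping early. A minimal sketch of that end-of-training condition, using a hypothetical `train` helper (not the actual `Trainer` loop):

```python
from itertools import cycle, islice

def train(dataset, max_steps, num_train_epochs=3):
    """Toy loop illustrating the documented condition: a positive
    max_steps overrides num_train_epochs, and a finite dataset is
    cycled (re-iterated when exhausted) until max_steps is reached."""
    steps = 0
    if max_steps > 0:
        # cycle() restarts the finite dataset when it runs out,
        # so exactly max_steps batches are consumed.
        for _batch in islice(cycle(dataset), max_steps):
            steps += 1
    else:
        # Default behavior: train for num_train_epochs passes.
        for _epoch in range(num_train_epochs):
            for _batch in dataset:
                steps += 1
    return steps

# A 4-batch dataset with max_steps=10 is traversed 2.5 times.
print(train([0, 1, 2, 3], max_steps=10))  # 10
print(train([0, 1, 2, 3], max_steps=-1))  # 3 epochs x 4 batches = 12
```

With the old (incorrect) wording one would expect the first call to stop at 4 steps when the data ran out; the fixed text matches the cycling behavior shown here.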