Unverified Commit b2c63c79 authored by Quentin Gallouédec, committed by GitHub

Fix `max_steps` documentation regarding the end-of-training condition (#27624)



* fix max_steps doc

* Update src/transformers/training_args.py [ci skip]
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* propagate suggested change

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
parent c651eb23
@@ -234,8 +234,8 @@ class TrainingArguments:
             the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
-            In case of using a finite iterable dataset the training may stop before reaching the set number of steps
-            when all data is exhausted
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         lr_scheduler_type (`str` or [`SchedulerType`], *optional*, defaults to `"linear"`):
             The scheduler type to use. See the documentation of [`SchedulerType`] for all possible values.
         lr_scheduler_kwargs ('dict', *optional*, defaults to {}):
@@ -2181,9 +2181,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         gradient_accumulation_steps (`int`, *optional*, defaults to 1):
             Number of updates steps to accumulate the gradients for, before performing a backward/update pass.
@@ -2588,9 +2588,9 @@ class TrainingArguments:
             Total number of training epochs to perform (if not an integer, will perform the decimal part percents
             of the last epoch before stopping training).
         max_steps (`int`, *optional*, defaults to -1):
-            If set to a positive number, the total number of training steps to perform. Overrides
-            `num_train_epochs`. In case of using a finite iterable dataset the training may stop before reaching
-            the set number of steps when all data is exhausted.
+            If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
@@ -92,6 +92,8 @@ class TFTrainingArguments(TrainingArguments):
             Total number of training epochs to perform.
         max_steps (`int`, *optional*, defaults to -1):
             If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
+            For a finite dataset, training is reiterated through the dataset (if all data is exhausted) until
+            `max_steps` is reached.
         warmup_ratio (`float`, *optional*, defaults to 0.0):
             Ratio of total training steps used for a linear warmup from 0 to `learning_rate`.
         warmup_steps (`int`, *optional*, defaults to 0):
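For reference, a minimal usage sketch (not part of this commit) of the behavior the corrected docstring describes: when `max_steps` is positive it overrides `num_train_epochs`, and a finite dataset is re-iterated until that many optimizer steps have run. The `output_dir` value and the numbers below are illustrative only.

from transformers import TrainingArguments

# Illustrative values; "out" is a placeholder output directory.
args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,             # ignored: max_steps takes precedence
    max_steps=500,                  # train for exactly 500 optimizer steps
    per_device_train_batch_size=8,
)

# With a finite dataset of, say, 1_000 examples and batch size 8, one pass
# is 125 steps, so the Trainer re-iterates the dataset (4 full passes)
# until step 500 is reached, rather than stopping when the data is exhausted.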