Fix the behavior of collecting 'num_input_tokens_seen' (#29099)

Fix the behavior of collecting 'num_input_tokens_seen'. See https://github.com/huggingface/transformers/issues/28791 for more details.

Commit afe73aed, authored Mar 25, 2024 by yhuang, committed by GitHub Mar 25, 2024

Showing with 6 additions and 1 deletion

src/transformers/trainer.py +6 -1

--- a/src/transformers/trainer.py
+++ b/src/transformers/trainer.py
@@ -2097,7 +2097,12 @@ class Trainer:
                            "a `main_input_name` attribute to the model class you are using."
                        )
                    else:
-                        self.state.num_input_tokens_seen += self.accelerator.gather(inputs[main_input_name]).numel()
+                        input_device = inputs[main_input_name].device
+                        self.state.num_input_tokens_seen += torch.sum(
+                            self.accelerator.gather(
+                                torch.tensor(inputs[main_input_name].numel(), device=input_device, dtype=torch.int64)
+                            )
+                        ).item()
                if rng_to_sync:
                    self._load_rng_state(resume_from_checkpoint)
                    rng_to_sync = False
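
The diff above stops gathering the full input tensor and instead gathers one element count per process, then sums the counts. The counting pattern can be sketched in plain Python as follows. This is an illustration only, not the Trainer's code: processes are simulated as list entries, and `numel`, `fake_gather`, and `tokens_seen_this_step` are hypothetical helpers standing in for `Tensor.numel()`, `self.accelerator.gather`, and the updated statement.

```python
# Simulated per-process batches of token ids, shape (batch_size, seq_len).
# The two "ranks" deliberately hold batches of different sequence length.
rank_batches = [
    [[1, 2, 3, 4], [5, 6, 7, 8]],  # rank 0: 2 x 4 = 8 elements
    [[9, 10, 11], [12, 13, 14]],   # rank 1: 2 x 3 = 6 elements
]


def numel(batch):
    """Hypothetical stand-in for Tensor.numel() on a 2-D batch."""
    return sum(len(row) for row in batch)


def fake_gather(values):
    """Hypothetical stand-in for a collective gather: returns every rank's value."""
    return list(values)


def tokens_seen_this_step(batches):
    # Each rank contributes a single integer, so the gathered payload is
    # one value per rank, independent of the batch shape on that rank.
    local_counts = [numel(batch) for batch in batches]
    return sum(fake_gather(local_counts))


total = tokens_seen_this_step(rank_batches)
print(total)
```

In this sketch only scalars cross the simulated process boundary, which mirrors the `torch.tensor(..., dtype=torch.int64)` wrapper and the `torch.sum(...).item()` reduction in the added lines.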