Unverified Commit f71873c5 authored by Eugene Zapolsky, committed by GitHub

[deepspeed] check whether model is NLP one instead of counting on input type (#21800)



* trying to figure out whether model is NLP

* drop my changes and apply easier fix

* trying to handle all int input types

* fix logic

---------
Co-authored-by: Stas Bekman <stas@stason.org>
parent 72e9ca75
@@ -2562,8 +2562,8 @@ class Trainer:
             return type(data)(self._prepare_input(v) for v in data)
         elif isinstance(data, torch.Tensor):
             kwargs = {"device": self.args.device}
-            if self.deepspeed and data.dtype != torch.int64:
-                # NLP models inputs are int64 and those get adjusted to the right dtype of the
+            if self.deepspeed and (torch.is_floating_point(data) or torch.is_complex(data)):
+                # NLP models inputs are int/uint and those get adjusted to the right dtype of the
                 # embedding. Other models such as wav2vec2's inputs are already float and thus
                 # may need special handling to match the dtypes of the model
                 kwargs.update({"dtype": self.args.hf_deepspeed_config.dtype()})
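As a quick illustration (not part of the commit, just a sketch using plain PyTorch): the old gate `data.dtype != torch.int64` would also cast int32/int8/bool inputs to the DeepSpeed dtype, while the new gate only casts floating-point or complex tensors and leaves every integer input untouched.

```python
import torch

# Sketch only: compare the old and the new gating condition on a few dtypes.
# Only floating/complex tensors should be cast to the deepspeed dtype; integer
# token ids must keep their dtype so the embedding layer can consume them.
for t in (
    torch.zeros(2, dtype=torch.int64),    # typical NLP token ids
    torch.zeros(2, dtype=torch.int32),    # token ids with a smaller int dtype
    torch.zeros(2, dtype=torch.float32),  # e.g. raw audio for wav2vec2
):
    old_check = t.dtype != torch.int64                              # pre-PR logic
    new_check = torch.is_floating_point(t) or torch.is_complex(t)   # post-PR logic
    print(f"{t.dtype}: old would cast={old_check}, new casts={new_check}")
```

For an int32 input the old condition evaluates to True and the tensor would be cast to fp16/bf16, which an embedding lookup cannot consume; the new condition keeps all integer dtypes as-is and only adjusts genuinely floating/complex inputs such as wav2vec2 audio.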