Unverified commit d45f47ab authored by Park Jun, committed by GitHub

Fix: Disable torch.autocast in RotaryEmbedding of Gemma and LLaMa for MPS device (#29439)



* Fix: Disable torch.autocast in RotaryEmbedding of Gemma and LLaMa for MPS devices

* Update src/transformers/models/gemma/modeling_gemma.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update llama and gemma rope to use cpu on mps device

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
parent 2a939f20
@@ -115,7 +115,7 @@ class GemmaRotaryEmbedding(nn.Module):
         # Force float32 since bfloat16 loses precision on long contexts
         # See https://github.com/huggingface/transformers/pull/29285
         device_type = x.device.type
-        device_type = device_type if isinstance(device_type, str) else "cpu"
+        device_type = device_type if isinstance(device_type, str) and device_type != "mps" else "cpu"
         with torch.autocast(device_type=device_type, enabled=False):
             freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
             emb = torch.cat((freqs, freqs), dim=-1)
@@ -139,7 +139,7 @@ class LlamaRotaryEmbedding(nn.Module):
         # Force float32 since bfloat16 loses precision on long contexts
         # See https://github.com/huggingface/transformers/pull/29285
         device_type = x.device.type
-        device_type = device_type if isinstance(device_type, str) else "cpu"
+        device_type = device_type if isinstance(device_type, str) and device_type != "mps" else "cpu"
         with torch.autocast(device_type=device_type, enabled=False):
             freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
             emb = torch.cat((freqs, freqs), dim=-1)
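
For reference, a minimal, self-contained sketch of the pattern both hunks apply. The helper name _autocast_device_type and the tensor shapes are illustrative, not part of the transformers source; the point is that torch.autocast rejects device_type="mps" on the PyTorch versions this fix targets, and because the context only ever runs with enabled=False, substituting "cpu" for "mps" changes nothing else.

import torch

def _autocast_device_type(x: torch.Tensor) -> str:
    # Hypothetical helper: torch.autocast raises on device_type="mps" in the
    # PyTorch versions this fix targets, so route MPS tensors to "cpu". Since
    # the context below runs with enabled=False, the substituted device type
    # has no other effect.
    device_type = x.device.type
    return device_type if isinstance(device_type, str) and device_type != "mps" else "cpu"

# Toy stand-ins for the rotary-embedding inputs (shapes are illustrative):
# batch=2, head_dim/2=4 inverse frequencies, seq_len=8 positions.
x = torch.randn(2, 8, 16)
inv_freq_expanded = torch.randn(2, 4, 1)
position_ids_expanded = torch.arange(8, dtype=torch.float32).reshape(1, 1, 8).expand(2, 1, 8)

# Compute the RoPE angles in float32 regardless of any ambient autocast dtype,
# since bfloat16 loses precision on long contexts (see PR #29285).
with torch.autocast(device_type=_autocast_device_type(x), enabled=False):
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
    emb = torch.cat((freqs, freqs), dim=-1)

print(emb.shape)  # torch.Size([2, 8, 8])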