Change the way tensor is reshaped in BartAttention (from .view to .reshape) (#21860)

* Change the .view call to .reshape
* Change the .view call to .reshape in all the copies of BartAttention
* Fix copies and style
* Revert unnecessary changes

Commit ebd52589 (unverified), authored Mar 01, 2023 by raghavanone, committed by GitHub Mar 01, 2023
Showing with 4 additions and 4 deletions

src/transformers/models/wav2vec2/modeling_wav2vec2.py +2 -2

src/transformers/models/whisper/modeling_whisper.py +2 -2

--- a/src/transformers/models/wav2vec2/modeling_wav2vec2.py
+++ b/src/transformers/models/wav2vec2/modeling_wav2vec2.py
@@ -574,8 +574,8 @@ class Wav2Vec2Attention(nn.Module):

        proj_shape = (bsz * self.num_heads, -1, self.head_dim)
        query_states = self._shape(query_states, tgt_len, bsz).view(*proj_shape)
-        key_states = key_states.view(*proj_shape)
-        value_states = value_states.view(*proj_shape)
+        key_states = key_states.reshape(*proj_shape)
+        value_states = value_states.reshape(*proj_shape)

        src_len = key_states.size(1)
        attn_weights = torch.bmm(query_states, key_states.transpose(1, 2))

--- a/src/transformers/models/whisper/modeling_whisper.py
+++ b/src/transformers/models/whisper/modeling_whisper.py
@@ -319,8 +319,8 @@ class WhisperAttention(nn.Module):

        proj_shape = (bsz * self.num_heads, -1, self.head_dim)
        query_states = self._shape(query_states, tgt_len, bsz).view(*proj_shape)
-        key_states = key_states.view(*proj_shape)
-        value_states = value_states.view(*proj_shape)
+        key_states = key_states.reshape(*proj_shape)
+        value_states = value_states.reshape(*proj_shape)

        src_len = key_states.size(1)
        attn_weights = torch.bmm(query_states, key_states.transpose(1, 2))
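
The commit message does not say why `.reshape` is preferred here. One plausible reason, stated as an assumption, is that `key_states` and `value_states` can arrive non-contiguous (for example from a cached `past_key_value`), in which case `.view` raises while `.reshape` falls back to a copy. The sketch below illustrates that difference on a stand-alone tensor; the sizes and variable names are hypothetical and not taken from the model code.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
bsz, num_heads, seq_len, head_dim = 2, 4, 5, 8

hidden = torch.randn(bsz, seq_len, num_heads * head_dim)

# Shape (bsz, num_heads, seq_len, head_dim). The transpose makes the
# tensor non-contiguous unless .contiguous() is called afterwards.
non_contiguous_keys = hidden.view(bsz, seq_len, num_heads, head_dim).transpose(1, 2)
contiguous_keys = non_contiguous_keys.contiguous()

# Same target shape as in the diff.
proj_shape = (bsz * num_heads, -1, head_dim)

# On a contiguous tensor, .view succeeds and shares storage with its input.
viewed = contiguous_keys.view(*proj_shape)

# On the non-contiguous tensor, .view cannot merge the first two
# dimensions without copying, so it raises a RuntimeError.
try:
    non_contiguous_keys.view(*proj_shape)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape returns a view when it can and copies the data otherwise,
# so it works in both cases.
reshaped = non_contiguous_keys.reshape(*proj_shape)

print("view failed on non-contiguous input:", view_failed)
print("reshape result shape:", tuple(reshaped.shape))
print("same values as the contiguous view:", torch.equal(viewed, reshaped))
```

Because `.reshape` behaves like `.view` whenever a view is possible, the swap should not change results for inputs that were already contiguous; it only adds a copy in the cases where `.view` would have failed.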