"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "95792a948e68b8dc89a68bb9cc5bb7fc0a8a3e9c"
Unverified Commit 1ae132a0 authored by Patrick von Platen, committed by GitHub

[Reformer] Axial Pos Emb Improve mem usage reformer (#5209)

* improve mem handling

* improve mem for pos ax encodings
parent 51441040
@@ -154,9 +154,14 @@ class AxialPositionEmbeddings(nn.Module):
                 self.axial_pos_shape, sequence_length, self.least_common_mult_chunk_length,
             )
-            # reshape axial encodings and use only until sequence_length
-            position_encodings = torch.cat(broadcasted_weights, dim=-1)
-            position_encodings = position_encodings.view(batch_size, -1, position_encodings.shape[-1])[
+            # compute how many columns are needed
+            required_pos_encodings_columns = -(-sequence_length // self.axial_pos_shape[1])
+            # cut to columns that are needed
+            position_encodings = torch.cat(
+                [weight[:, :required_pos_encodings_columns] for weight in broadcasted_weights], dim=-1
+            )
+            position_encodings = torch.reshape(position_encodings, (batch_size, -1, position_encodings.shape[-1]))[
                 :, :sequence_length
             ]
...
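In short: the old code concatenated the fully broadcasted axial weights into a tensor covering the entire axial_pos_shape grid and only then sliced it to sequence_length, so peak memory scaled with the maximum supported sequence length. The new code first computes, via ceiling division, how many grid columns the input actually spans and slices each broadcasted weight before concatenating. Below is a minimal, self-contained sketch of that trick; the shapes (axial_pos_shape = (8, 16), per-axis dims (32, 96), sequence_length = 40) are made up for illustration and are not values from the library.

```python
import torch

# Sketch only, not the library code: stand-ins for the broadcasted axial weights,
# each of shape (batch, axial_pos_shape[0], axial_pos_shape[1], dim_i).
batch_size = 2
axial_pos_shape = (8, 16)      # the axial grid covers 8 * 16 = 128 positions
dims = (32, 96)                # per-axis embedding dims, summing to the hidden size
sequence_length = 40           # actual input length, well below 128

broadcasted_weights = [torch.randn(batch_size, *axial_pos_shape, dim) for dim in dims]

# Ceiling division (-(-a // b)): number of grid columns the sequence actually spans.
required_pos_encodings_columns = -(-sequence_length // axial_pos_shape[1])  # -> 3

# Slice each weight to the needed columns *before* concatenating, so the
# intermediate tensor covers only ~3 * 16 = 48 positions instead of all 128.
position_encodings = torch.cat(
    [weight[:, :required_pos_encodings_columns] for weight in broadcasted_weights], dim=-1
)
position_encodings = torch.reshape(
    position_encodings, (batch_size, -1, position_encodings.shape[-1])
)[:, :sequence_length]

print(position_encodings.shape)  # torch.Size([2, 40, 128])
```

The saving matters for Reformer because the axial grid is sized for very long maximum sequences, while typical inputs use only a small fraction of it.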