Unverified commit 812def00, authored by tommccoy, committed by GitHub

fix use of mems in Transformer-XL (#4826)

Fixed duplicated use of mems in Transformer-XL generation, which led to incorrect predictions and degraded performance.
parent 306f1a26
@@ -1016,11 +1016,14 @@ class TransfoXLLMHeadModel(TransfoXLPreTrainedModel):
         return self.crit.out_layers[-1]

     def prepare_inputs_for_generation(self, input_ids, past, **model_kwargs):
-        inputs = {"input_ids": input_ids}
+        inputs = {}
         # if past is defined in model kwargs then use it for faster decoding
         if past:
             inputs["mems"] = past
+            inputs["input_ids"] = input_ids[:, -1].unsqueeze(-1)
+        else:
+            inputs["input_ids"] = input_ids
         return inputs
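The intent of the fix is that when cached memories (`past`) are available, only the newest token needs to be fed to the model, since earlier positions are already encoded in the mems; feeding the full sequence again duplicates that context. A minimal sketch of the corrected logic, using plain Python lists in place of torch tensors (the real method slices with `input_ids[:, -1].unsqueeze(-1)`):

```python
def prepare_inputs_for_generation(input_ids, past):
    """Sketch of the fixed logic, with nested lists standing in for tensors."""
    inputs = {}
    # if past is defined then use it for faster decoding
    if past:
        inputs["mems"] = past
        # keep only the last token of each sequence; earlier tokens are
        # already represented in the cached mems
        inputs["input_ids"] = [[seq[-1]] for seq in input_ids]
    else:
        # no cache yet: feed the full sequence
        inputs["input_ids"] = input_ids
    return inputs
```

With a cache present, only the last token is passed along with the mems; without one, the full sequence is used.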