Unverified commit 8453201c authored by Sylvain Gugger, committed by GitHub

Avoid erasing the attention mask when double padding (#8915)

parent 0deece9c
@@ -3047,9 +3047,8 @@ class PreTrainedTokenizerBase(SpecialTokensMixin):
                 encoded_inputs["input_ids"] = [self.pad_token_id] * difference + encoded_inputs["input_ids"]
             else:
                 raise ValueError("Invalid padding strategy:" + str(self.padding_side))
-        else:
-            if return_attention_mask:
-                encoded_inputs["attention_mask"] = [1] * len(encoded_inputs["input_ids"])
+        elif return_attention_mask and "attention_mask" not in encoded_inputs:
+            encoded_inputs["attention_mask"] = [1] * len(encoded_inputs["input_ids"])
 
         return encoded_inputs
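The bug this patch fixes: when padding runs a second time over already-padded inputs, no new pad tokens are needed, and the old code then unconditionally rebuilt the attention mask as all ones, erasing the 0s recorded for the pad positions in the first pass. The guard `"attention_mask" not in encoded_inputs` only synthesizes a mask when none exists. Below is a minimal standalone sketch of that flow (not the actual transformers code; `pad` and `PAD_TOKEN_ID` are simplified stand-ins) showing the mask surviving a second padding pass:

PAD_TOKEN_ID = 0  # hypothetical pad token id for the sketch

def pad(encoded_inputs, max_length, return_attention_mask=True):
    difference = max_length - len(encoded_inputs["input_ids"])
    if difference > 0:
        if return_attention_mask:
            # Extend (or create) the mask alongside the new pad tokens.
            mask = encoded_inputs.get("attention_mask", [1] * len(encoded_inputs["input_ids"]))
            encoded_inputs["attention_mask"] = mask + [0] * difference
        encoded_inputs["input_ids"] = encoded_inputs["input_ids"] + [PAD_TOKEN_ID] * difference
    elif return_attention_mask and "attention_mask" not in encoded_inputs:
        # Post-fix behavior: only synthesize an all-ones mask when none exists.
        # The pre-fix code ran this branch unconditionally, overwriting the 0s
        # added by a previous padding pass.
        encoded_inputs["attention_mask"] = [1] * len(encoded_inputs["input_ids"])
    return encoded_inputs

# First pass pads to length 5 and records which positions are real tokens.
batch = {"input_ids": [101, 102, 103]}
pad(batch, max_length=5)
assert batch["attention_mask"] == [1, 1, 1, 0, 0]

# Second pass is a no-op for input_ids; with the fix the existing mask is
# preserved instead of being reset to [1, 1, 1, 1, 1].
pad(batch, max_length=5)
assert batch["attention_mask"] == [1, 1, 1, 0, 0]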