Native Torchscript Wordpiece Tokenizer Op for BERTSquadQA, Torchscriptify BertSQUADQAModel (#879)

Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/879
Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1023
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1211

Added a new native op that performs wordpiece tokenization and additionally returns each token's start and end indices in the raw text, as required by BertSquadQA. Includes unit tests for the native op, plus tests that check its parity with the PyText Wordpiece Tokenizer. Also included is a TorchScript implementation of the BERT SQuAD QA model, along with scripts for evaluating and testing the TorchScript code.

Reviewed By: borguz, hikushalhere
Differential Revision: D17455985
fbshipit-source-id: c2617c7ecbce0f733b31d04558da965d0b62637b
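
To illustrate what "wordpiece tokenization with start and end indices" means, here is a minimal pure-Python sketch. It is not the native op from this commit: the function names, the whitespace-only pre-tokenization, the `[UNK]` fallback, and the exclusive end offsets are all assumptions made for illustration, and the real op may differ on each point.

```python
from typing import List, Optional, Set, Tuple


def _split_word(word: str, vocab: Set[str]) -> Optional[List[Tuple[str, int, int]]]:
    """Greedy longest-match-first split of one word into wordpieces.

    Returns (piece, start, end) triples relative to the word, or None if
    some part of the word cannot be matched against the vocabulary.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece marker
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return None
        pieces.append((match, start, end))
        start = end
    return pieces


def wordpiece_with_offsets(
    text: str, vocab: Set[str], unk: str = "[UNK]"
) -> Tuple[List[str], List[int], List[int]]:
    """Hypothetical helper: tokens plus character offsets into `text`.

    Offsets are [start, end) positions in the raw text (an assumption).
    """
    tokens, starts, ends = [], [], []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        j = i
        while j < n and not text[j].isspace():
            j += 1
        pieces = _split_word(text[i:j], vocab)
        if pieces is None:
            # Unmatched word: emit one unknown token spanning the whole word.
            tokens.append(unk)
            starts.append(i)
            ends.append(j)
        else:
            for piece, s, e in pieces:
                tokens.append(piece)
                starts.append(i + s)
                ends.append(i + e)
        i = j
    return tokens, starts, ends
```

Because every token carries offsets into the raw text, a predicted answer span over token positions can be mapped back to a substring with `text[starts[a]:ends[b]]`, which is the kind of lookup a SQuAD-style QA model needs.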

Commit de348d1f authored Oct 04, 2019 by Debojeet Chatterjee; committed by Facebook Github Bot Oct 04, 2019
Showing 2 changed files with 5 additions and 12 deletions

fairseq/modules/learned_positional_embedding.py +1 -1
fairseq/modules/multihead_attention.py +4 -11

--- a/fairseq/modules/learned_positional_embedding.py
+++ b/fairseq/modules/learned_positional_embedding.py
@@ -38,7 +38,7 @@ class LearnedPositionalEmbedding(nn.Embedding):
                positions = input.data.new(1, 1).fill_(int(self.padding_idx + input.size(1)))
            else:
                positions = utils.make_positions(
-                    input.data, self.padding_idx, onnx_trace=self.onnx_trace,
+                    input, self.padding_idx, onnx_trace=self.onnx_trace,
                )
        return super().forward(positions)

--- a/fairseq/modules/multihead_attention.py
+++ b/fairseq/modules/multihead_attention.py
@@ -255,17 +255,10 @@ class MultiheadAttention(nn.Module):
        if key_padding_mask is not None:
            # don't attend to padding symbols
            attn_weights = attn_weights.view(bsz, self.num_heads, tgt_len, src_len)
-            if self.onnx_trace:
-                attn_weights = torch.where(
-                    key_padding_mask.unsqueeze(1).unsqueeze(2),
-                    torch.Tensor([float("-Inf")]),
-                    attn_weights.float()
-                ).type_as(attn_weights)
-            else:
-                attn_weights = attn_weights.masked_fill(
-                    key_padding_mask.unsqueeze(1).unsqueeze(2),
-                    float('-inf'),
-                )
+            attn_weights = attn_weights.masked_fill(
+                key_padding_mask.unsqueeze(1).unsqueeze(2),
+                float('-inf'),
+            )
            attn_weights = attn_weights.view(bsz * self.num_heads, tgt_len, src_len)
        if before_softmax:
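
The change to multihead_attention.py drops the `torch.where` branch and keeps a single `masked_fill` call. The sketch below suggests why the two are interchangeable for this masking step. It is a standalone illustration rather than fairseq code: the tensor shapes are made up, and it assumes a boolean `key_padding_mask` on a recent PyTorch version.

```python
import torch

# Illustrative sizes only (assumptions, not values from fairseq).
bsz, num_heads, tgt_len, src_len = 2, 3, 4, 5

attn_weights = torch.randn(bsz, num_heads, tgt_len, src_len)

# True marks padding positions in the source sequence.
key_padding_mask = torch.zeros(bsz, src_len, dtype=torch.bool)
key_padding_mask[0, -2:] = True  # last two source tokens of sample 0 are padding
key_padding_mask[1, -1:] = True  # last source token of sample 1 is padding

# Shape (bsz, 1, 1, src_len); broadcasts over heads and target positions.
mask = key_padding_mask.unsqueeze(1).unsqueeze(2)

# Path kept by the commit: a single masked_fill.
filled = attn_weights.masked_fill(mask, float("-inf"))

# Path removed by the commit, rewritten here with full_like for clarity.
where = torch.where(
    mask, torch.full_like(attn_weights, float("-inf")), attn_weights
)

# After softmax, padded source positions receive zero attention.
probs = torch.softmax(filled, dim=-1)
```

Both paths write negative infinity at padded positions and leave the other entries untouched, so the softmax assigns those positions zero probability either way.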