"git@developer.sourcefind.cn:OpenDAS/dgl.git" did not exist on "6e7f19f27bdf6fddc69aaf8bf2b392d685703f76"
Avoid cast in PositionalEmbeddings to fix BLEU drop in pytorch native export
Summary: Tracing mode doesn't generalize correctly in positional embedding calculation, which caused -5 BLEU at transformer export when using pytorch native. Details: The original issue was that in ensemble_export, _to_tensor(x) in scripting mode turns integer x into 1-d tensor torch.tensor([x]), not 0-d tensor (scalar x) which is expected in the embedding. So the return value in embedding forward() is actually of wrong shape. When self.weights is of size [x,y], the return value should be (bsz, y, 1) but it was (bsz, 1, y), which caused problem in downstream computation. Tracing only becomes an issue when I used pos = timestep.view(-1)[0] to fix the shape. Then casting the scalar to primary int, to be used as index is not generalizable by tracing mode. Thus I need to convert everything to tensor and replace the advanced indexing with index_select operator. In summary, less understood features in both scripting&tracing sides caused the bleu drop. :) Reviewed By: myleott Differential Revision: D16623025 fbshipit-source-id: 0c7a2c3eafbd774760a5c880c6034009ee084abb
Showing
Please register or sign in to comment