"example/vscode:/vscode.git/clone" did not exist on "44be91d3d15b485aed091f920e863545a8765489"
Fix uninitialized parameter in conformer relative attention. (#18368)
`torch.Tensor(...)` creates an uninitialized tensor (as `torch.empty` does). This leads to non-deterministic behavior, poor initialization, and NaNs with an unlucky init. The paper does not specify the initialization for the bias terms, so zero seems like a good choice: no bias initially. In practice `torch.Tensor` memory often happens to be zero-filled, so this fix stays close to the previous behavior:
```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```
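A minimal sketch of the fix, using a hypothetical module and parameter names (`RelativeAttentionBias`, `pos_bias_u`, `pos_bias_v`) rather than the actual conformer code: replace the uninitialized `torch.Tensor(...)` allocation with an explicit `torch.zeros(...)`.

```python
import torch
import torch.nn as nn

class RelativeAttentionBias(nn.Module):
    """Hypothetical module illustrating the fix, not the actual conformer code."""

    def __init__(self, num_heads: int, head_dim: int):
        super().__init__()
        # Before: uninitialized memory, may contain garbage or NaNs.
        # self.pos_bias_u = nn.Parameter(torch.Tensor(num_heads, head_dim))
        # After: explicit zero initialization -- no bias initially.
        self.pos_bias_u = nn.Parameter(torch.zeros(num_heads, head_dim))
        self.pos_bias_v = nn.Parameter(torch.zeros(num_heads, head_dim))

bias = RelativeAttentionBias(num_heads=4, head_dim=64)
# Zero-initialized parameters are always finite and sum to exactly zero.
assert torch.isfinite(bias.pos_bias_u).all()
assert bias.pos_bias_u.sum().item() == 0.0
```

Zero init also keeps training deterministic across runs, since the parameter values no longer depend on whatever happened to be in the allocated memory.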