[XLNet] Changed post-processing of attention w.r.t. target_mapping
Whenever target_mapping is provided as an input, XLNet outputs two different attention streams. Depending on that, the attention output is one of the two:
- a list of tensors (the usual case for most transformers)
- a list of 2-tuples of tensors, one tensor for each attention stream

Docs and unit tests have been updated.
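Callers that consume the attention output may need to handle both shapes described above. A minimal sketch of one way to do that follows. The helper name `split_attention_streams` is hypothetical and is not part of this commit or of the library, and plain strings stand in for tensors so that the snippet runs without any dependencies. It also assumes an individual tensor is never itself a Python tuple.

```python
from typing import Any, List, Optional, Sequence, Tuple


def split_attention_streams(
    attentions: Sequence[Any],
) -> Tuple[List[Any], Optional[List[Any]]]:
    """Hypothetical helper: normalize either attention layout into per-layer lists.

    Returns (first_stream, second_stream). second_stream is None when each
    layer carries a single attention tensor rather than a 2-tuple.
    """
    # Two-stream case: every per-layer entry is a 2-tuple of tensors.
    if len(attentions) > 0 and all(
        isinstance(layer, tuple) and len(layer) == 2 for layer in attentions
    ):
        first = [layer[0] for layer in attentions]
        second = [layer[1] for layer in attentions]
        return first, second
    # Single-stream case: one tensor per layer.
    return list(attentions), None


# Stand-in values; in practice each entry would be an attention tensor.
single_stream = ["layer0", "layer1"]
two_streams = [("layer0_a", "layer0_b"), ("layer1_a", "layer1_b")]

first_only, missing = split_attention_streams(single_stream)
first, second = split_attention_streams(two_streams)
```

Returning `None` for the second stream keeps the single-stream case explicit, so downstream code can branch on it instead of inspecting the element types again.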