[XLNet] Changed post-processing of attention w.r.t. target_mapping
Whenever target_mapping is provided as an input, XLNet outputs two different attention streams. Depending on that, the attention output is one of the following:
- a list of tensors (the usual case for most transformers)
- a list of 2-tuples of tensors, one tensor for each attention stream

Docs and unit tests have been updated.
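Since the attention output can now take two shapes, calling code may want to normalize it before use. The following is a minimal sketch of one way to do that, not part of this commit: the helper name `split_attention_streams` is hypothetical, plain strings stand in for tensors, and the order of the two streams inside each tuple is an assumption, since the commit message does not state it.

```python
from typing import Any, List, Optional, Sequence, Tuple


def split_attention_streams(
    attentions: Sequence[Any],
) -> List[Tuple[Any, Optional[Any]]]:
    """Hypothetical helper: normalize per-layer attention outputs.

    Each entry of `attentions` is either a single tensor (no target_mapping)
    or a 2-tuple of tensors (target_mapping provided). The result always has
    one (first_stream, second_stream) pair per layer, with second_stream set
    to None when only one stream exists.
    """
    normalized = []
    for layer_attention in attentions:
        if isinstance(layer_attention, tuple):
            if len(layer_attention) != 2:
                raise ValueError(
                    f"expected a 2-tuple of streams, got {len(layer_attention)} items"
                )
            first, second = layer_attention
            normalized.append((first, second))
        else:
            # Single-stream case: the entry is the attention tensor itself.
            normalized.append((layer_attention, None))
    return normalized


# Stand-in values; real entries would be tensors returned by the model.
single_stream = ["layer0", "layer1"]
two_streams = [("layer0_a", "layer0_b"), ("layer1_a", "layer1_b")]

for first, second in split_attention_streams(two_streams):
    print(first, second)
```

The `isinstance(..., tuple)` check relies on tensors not being tuples, so the same loop handles both output shapes without the caller needing to know whether target_mapping was passed.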