"...blobs/bbbcb9f65616524d6199fa3bc16dc0500fb2cbbb" did not exist on "5e84353ebab5e0ce4fc762f64fabbdd9ac0c282a"
Re-order attention head outputs for better perf
Significant performance boost over the original ordering: on an already somewhat optimised branch, this gave me >2x end-to-end throughput on a SQuAD XLNet fine-tuning task (batch size 8, sequence length 612, fp16).
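A minimal sketch of the kind of win head-output reordering can give (not the actual diff in this commit): if the attention output is produced with the head dimension adjacent to the head size, merging heads is a free `view`; otherwise the `reshape` after a `permute` forces a full copy of the tensor on every layer. All shapes and names below are illustrative.

```python
import torch

batch, heads, seq, d_head = 2, 4, 8, 16

# Attention output in (batch, heads, seq, d_head) layout, as many
# implementations compute it.
attn = torch.randn(batch, heads, seq, d_head)

# Original ordering: the permute yields a non-contiguous tensor, so the
# reshape to (batch, seq, heads * d_head) must materialise a copy.
out_slow = attn.permute(0, 2, 1, 3).reshape(batch, seq, heads * d_head)

# Reordered: if the output is kept in (batch, seq, heads, d_head) layout
# from the start (emulated here with .contiguous()), merging the last
# two dimensions is a zero-copy view.
attn_reordered = attn.permute(0, 2, 1, 3).contiguous()
out_fast = attn_reordered.view(batch, seq, heads * d_head)

# Both orderings produce identical results; only the memory traffic differs.
assert torch.equal(out_slow, out_fast)
```

Avoiding that per-layer copy (and keeping subsequent matmuls on contiguous inputs) is where the throughput gain comes from, especially at long sequence lengths like 612.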