perf: prefer batched matmuls for attention (#1203)
perf: prefer batched matmuls for attention; add a fast path to Decoder when num_heads=1