-
Burc Eryilmaz authored
Fuses dropout and softmax in backward pass, add bias support to CPP MHA, add additive mask support, separate Q/K/V parameters (#854) Co-authored-by:Sukru Eryilmaz <seryilmaz@computelab-dgx1v-32.nvidia.com>
6c2babf9
Fuses dropout and softmax in backward pass, add bias support to CPP MHA, add additive mask support, separate Q/K/V parameters (#854)
Co-authored-by:
Sukru Eryilmaz <seryilmaz@computelab-dgx1v-32.nvidia.com>