Eliminate 2 GPU ops during sampling when logit_bias is zero (#343)
Co-authored-by: Qubitium <417764+Qubitium@users.noreply.github.com>