"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "dee8af4e46d1b0177443cc842c6742f88910afeb"
Revert "Error (also in original) model, scaling only q matrix not qk.T dot...
Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head))" (#22444)

Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head)) (#21627)"

This reverts commit bad83008.
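
For context, the two placements of the scaling factor named in the commit title can be compared numerically. The following is a minimal sketch in plain Python, not the code touched by this commit. The function names (`scores_scale_query`, `scores_scale_product`, `matmul_transposed`) and the toy shapes are hypothetical, chosen only for illustration. It assumes nothing else (such as a bias or a mask) is applied between the scaling and the dot product.

```python
import math
import random


def matmul_transposed(q, k):
    """Return q @ k.T for two lists of row vectors (hypothetical helper)."""
    return [[sum(a * b for a, b in zip(q_row, k_row)) for k_row in k] for q_row in q]


def scores_scale_query(q, k, dim_per_head):
    """Scale q by 1/sqrt(dim_per_head) first, then take the dot product."""
    scale = 1.0 / math.sqrt(dim_per_head)
    q_scaled = [[x * scale for x in row] for row in q]
    return matmul_transposed(q_scaled, k)


def scores_scale_product(q, k, dim_per_head):
    """Take the dot product first, then scale: qk.T / sqrt(dim_per_head)."""
    scale = 1.0 / math.sqrt(dim_per_head)
    return [[s * scale for s in row] for row in matmul_transposed(q, k)]


# Toy inputs (assumed shapes: 3 query rows, 4 key rows, 8 dims per head).
random.seed(0)
dim_per_head = 8
q = [[random.gauss(0, 1) for _ in range(dim_per_head)] for _ in range(3)]
k = [[random.gauss(0, 1) for _ in range(dim_per_head)] for _ in range(4)]

a = scores_scale_query(q, k, dim_per_head)
b = scores_scale_product(q, k, dim_per_head)

# Largest element-wise difference between the two score matrices.
max_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
print(max_diff)
```

Under these assumptions the two variants differ only by floating-point rounding, since a scalar factor commutes with the matrix product.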