MixtralSparseMoeBlock: add gate jitter (#29865)
This commit adds optional gate jitter to MixtralSparseMoeBlock. When jitter is turned on, noise is applied to the block's input data before it is passed through the MoE layer.
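As an illustration of the idea only, here is a minimal, framework-free sketch in plain Python. It is not the code from this commit. It assumes the jitter is multiplicative noise drawn uniformly from [1 - jitter_noise, 1 + jitter_noise] and applied only in training mode, which is a common way to implement jitter in sparse MoE layers. The names `apply_jitter`, `jitter_noise`, `training`, and `rng` are hypothetical and chosen for the example.

```python
import random


def apply_jitter(hidden_states, jitter_noise, training=True, rng=random):
    """Return hidden_states with multiplicative uniform jitter applied.

    Assumption (not confirmed by the commit message): each activation is
    scaled by an independent factor drawn from
    [1 - jitter_noise, 1 + jitter_noise], and only while training.

    hidden_states is a list of rows (token vectors) of floats.
    """
    # Jitter is "turned off" when not training or when the noise level is zero;
    # in that case the input is returned untouched.
    if not training or jitter_noise <= 0:
        return hidden_states

    low, high = 1.0 - jitter_noise, 1.0 + jitter_noise
    # Build a new nested list so the caller's data is not mutated.
    return [[x * rng.uniform(low, high) for x in row] for row in hidden_states]


# Usage: jitter two token vectors by up to +/-10% before routing them.
tokens = [[1.0, -2.0], [0.5, 4.0]]
jittered = apply_jitter(tokens, jitter_noise=0.1, rng=random.Random(0))
```

In a tensor framework the same step would typically be a single elementwise multiply on the input tensor, performed before the router computes its gating logits.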