-
Lorenzo Verardo authored
This commit adds gate jitter to MixtralSparseMoeBlock's input data before passing it through the MoE layer, if turned on.
a25037be
This commit adds gate jitter to MixtralSparseMoeBlock's input data before passing it through the MoE layer, if turned on.