"vscode:/vscode.git/clone" did not exist on "c51b6bd83749faaa163064668f4d6dd5c9d2aac9"
Fix load balancing loss func for mixtral (#28256)
* Correct the implementation of the auxiliary loss of Mixtral
* Correct the implementation of the auxiliary loss of Mixtral
* Implement a simpler calculation method
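
For context, a minimal sketch of a Switch-Transformer-style load-balancing auxiliary loss for a top-k router is given below. It is an illustration under stated assumptions, not the code from this commit: the function name `load_balancing_loss`, its arguments, and the pure-Python list representation are all hypothetical. The loss is the number of experts times the sum, over experts, of the fraction of tokens routed to each expert multiplied by the mean router probability of that expert.

```python
import math


def softmax(row):
    """Numerically stable softmax over one token's router logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def load_balancing_loss(router_logits, num_experts, top_k=2):
    """Hypothetical helper: Switch-style auxiliary loss for a top-k router.

    router_logits: list of per-token lists, each of length num_experts.
    Returns num_experts * sum_i(fraction_routed[i] * mean_prob[i]).
    """
    num_tokens = len(router_logits)
    probs = [softmax(row) for row in router_logits]

    # Count how many tokens select each expert among their top_k choices.
    routed_counts = [0] * num_experts
    for p in probs:
        top = sorted(range(num_experts), key=lambda i: p[i], reverse=True)[:top_k]
        for i in top:
            routed_counts[i] += 1

    # Fraction of tokens routed to each expert.
    fraction_routed = [c / num_tokens for c in routed_counts]
    # Mean router probability assigned to each expert.
    mean_prob = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]

    return num_experts * sum(f * q for f, q in zip(fraction_routed, mean_prob))
```

Under this formulation, a router that puts most of its probability mass on the experts it actually selects yields a larger value than a router with uniform probabilities, which is what pushes training toward spreading the load.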
---------
Co-authored-by: zhangliangxu3 <zhangliangxu3@jd.com>