Disable TP for shared experts under expert parallelism for GLM4.5 model (#8647)
Co-authored-by: Stefan He <hebiaobuaa@gmail.com>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>