mlp_only_layers is more flexible than decoder_sparse_step (#30552)
* force back to commit ba40a21 and fix workflow errors
* match the review suggestions
* fix ci errors
* fix CI
* fix ci, format code
* fix ci, ruff format
* fix ci, ruff format again
* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/qwen2_moe/configuration_qwen2_moe.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* solve this warning: Default Argument Value is mutable

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
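
For illustration only, here is a rough sketch of why a per-layer list can express layouts that a fixed stride cannot, and of the usual fix for the mutable-default warning mentioned above. The helpers `layer_uses_moe` and `layer_plan` are hypothetical names invented for this note, and the selection rule is an assumption about how the two options combine, not a copy of the model code.

```python
from typing import List, Optional


def layer_uses_moe(
    layer_idx: int,
    num_experts: int,
    decoder_sparse_step: int,
    mlp_only_layers: Optional[List[int]] = None,
) -> bool:
    """Hypothetical helper: decide whether a decoder layer gets a sparse MoE block.

    Assumed rule: a layer listed in mlp_only_layers is always a dense MLP;
    otherwise every decoder_sparse_step-th layer is sparse.
    """
    # Default to None and build the list inside the function, so that no
    # mutable default object is shared between calls.
    mlp_only_layers = [] if mlp_only_layers is None else mlp_only_layers
    if layer_idx in mlp_only_layers:
        return False
    return num_experts > 0 and (layer_idx + 1) % decoder_sparse_step == 0


def layer_plan(
    num_layers: int,
    num_experts: int,
    decoder_sparse_step: int,
    mlp_only_layers: Optional[List[int]] = None,
) -> List[str]:
    """Hypothetical helper: label each layer as 'moe' or 'mlp'."""
    return [
        "moe"
        if layer_uses_moe(i, num_experts, decoder_sparse_step, mlp_only_layers)
        else "mlp"
        for i in range(num_layers)
    ]


# A stride alone can only produce regular patterns.
print(layer_plan(4, num_experts=8, decoder_sparse_step=2))
# An explicit list can pick arbitrary dense layers, e.g. the first and last.
print(layer_plan(4, num_experts=8, decoder_sparse_step=1, mlp_only_layers=[0, 3]))
```

Under these assumptions, the second call yields a dense-sparse-sparse-dense layout, which no single stride value can describe.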