[shardformer] refactored embedding and dropout to parallel module (#4013)
* [shardformer] refactored embedding and dropout to parallel module
* polish code