[DeepSpeed ZeRO3] Fix performance degradation in sharded models (#18911)
* [DeepSpeed] Fix performance degradation in sharded models
* style
* polish
Co-authored-by: Stas Bekman <stas@stason.org>