ZeRO-1 tune max-elems + bug fix (#532)
* ZeRO-1 memory fix
* auto-tune max elems per comm to reduce padding/comm intervals
* clean-up; added previously missing reduction options
* fix testing backend to work with torch 1.7
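The auto-tune item above trades padding against the number of communication intervals. As a rough illustration only, and not the code from this change, the idea can be sketched as follows. The sketch assumes each communication chunk is padded up to a multiple of the world size; the names `padded_total`, `tune_max_elems`, and the candidate list are hypothetical.

```python
def padded_total(num_elems, max_elems, world_size):
    """Return (padding, num_comms) for a hypothetical partitioning scheme.

    Assumption: num_elems is split into chunks of at most max_elems, and
    each chunk is padded up to the next multiple of world_size.
    """
    padding = 0
    num_comms = 0
    remaining = num_elems
    while remaining > 0:
        chunk = min(remaining, max_elems)
        # Elements needed to round this chunk up to a multiple of world_size.
        padding += (-chunk) % world_size
        num_comms += 1
        remaining -= chunk
    return padding, num_comms


def tune_max_elems(num_elems, world_size, candidates):
    """Pick the candidate with the least padding, then the fewest comms."""
    def cost(max_elems):
        padding, num_comms = padded_total(num_elems, max_elems, world_size)
        return (padding, num_comms, max_elems)

    return min(candidates, key=cost)


if __name__ == "__main__":
    best = tune_max_elems(1000, 8, [96, 100, 128, 500])
    print(best, padded_total(1000, best, 8))
```

Ranking by padding first and interval count second is one possible choice; a real tuner might weight the two differently or search a wider range of sizes.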