[deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop (#16717)
* [deepspeed / m2m_100] make deepspeed 3 work with layerdrop
* fix
* revert last