Fix gradient_checkpointing backward compatibility (#14408)
* Fix gradient_checkpointing backward compatibility
* Remove needless line
* Make sure mask probability is big enough and mask length small enough
* Fix tests
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>