[`GPTNeo`] Fix gradient checkpointing bug (#21733)
* fix bug
* forward contrib credits from discussions
* change logic
---------
Co-authored-by: edbeeching <edbeeching@users.noreply.github.com>