add warning when using gradient_checkpointing with FSDP full shard (#31578)
* add warning when using with FSDP full shard

* fix style

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add hybrid shard warn

* fix style

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
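A minimal sketch of the kind of check this PR adds to `training_args.py`: when `gradient_checkpointing` is enabled together with FSDP full shard (or hybrid shard), emit a warning steering users toward FSDP's own activation checkpointing instead. The function name, signature, and exact warning wording below are illustrative assumptions for a standalone example, not the actual patch in `TrainingArguments.__post_init__`.

```python
import warnings


def warn_on_fsdp_gradient_checkpointing(fsdp_options, gradient_checkpointing):
    """Warn if gradient checkpointing is combined with FSDP full/hybrid shard.

    Hypothetical standalone helper; in the real patch the equivalent check
    reads attributes of TrainingArguments rather than taking arguments.
    """
    if gradient_checkpointing and any(
        opt in fsdp_options for opt in ("full_shard", "hybrid_shard")
    ):
        # Illustrative message: with full/hybrid sharding, prefer FSDP's
        # activation checkpointing (configured via fsdp_config) over the
        # generic gradient_checkpointing flag.
        warnings.warn(
            "When using FSDP full shard, consider `activation_checkpointing` "
            "in `fsdp_config` instead of `gradient_checkpointing` in "
            "TrainingArguments."
        )


# Example: this combination triggers the warning.
warn_on_fsdp_gradient_checkpointing(["full_shard"], gradient_checkpointing=True)
```

The check is a warning rather than an error because the combination still runs; it is merely less efficient than FSDP-native activation checkpointing.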