-
Pedro Cuenca authored
* Fix resuming state when using gradient checkpointing. Also, allow --resume_from_checkpoint to be used when the checkpoint does not yet exist (a normal training run will be started). * style
f4dddaf5
* Fix resuming state when using gradient checkpointing. Also, allow --resume_from_checkpoint to be used when the checkpoint does not yet exist (a normal training run will be started). * style