Commit 3c724416 authored by Igor Fedorov, committed by Facebook GitHub Bot

Enable training for fraction of total steps; enable early stopping from trial 0

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/627

Enable training for a fraction of the total steps: when doing HPO (hyperparameter optimization), users may want to train for only a fraction of the training steps of a regular (baseline) training run. Changing SOLVER.MAX_ITER alone is not enough, because that also changes the learning rate schedule. We introduce a multiplier, SOLVER.EARLY_STOPPING_FRACTION (constrained to the range [0, 1]), that is applied on top of SOLVER.MAX_ITER when deciding how many steps to train for. This multiplier does not scale the number of steps over which the learning rate schedule is defined.
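
To illustrate why shrinking SOLVER.MAX_ITER is not equivalent to stopping early, here is a minimal sketch. It assumes a cosine schedule purely for illustration; `cosine_lr`, `MAX_ITER`, and `FRACTION` are hypothetical names and values, not part of d2go, and the actual schedule depends on the solver configuration.

```python
import math


def cosine_lr(step: int, schedule_steps: int, base_lr: float = 0.1) -> float:
    """Hypothetical cosine schedule defined over `schedule_steps` steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / schedule_steps))


MAX_ITER = 1000  # baseline run length (illustrative value)
FRACTION = 0.5   # train for half of the baseline steps (illustrative value)

steps_to_train = int(MAX_ITER * FRACTION)

# Baseline: schedule and training both span MAX_ITER steps.
lr_baseline = [cosine_lr(s, MAX_ITER) for s in range(MAX_ITER)]

# Option A: shrink MAX_ITER itself. The schedule is compressed into fewer
# steps, so the learning rate decays to near zero by the end of the short run.
lr_shrunk = [cosine_lr(s, steps_to_train) for s in range(steps_to_train)]

# Option B: keep the schedule defined over MAX_ITER and simply stop early.
# Every step sees the same learning rate as the baseline run did.
lr_truncated = [cosine_lr(s, MAX_ITER) for s in range(steps_to_train)]

print(lr_truncated == lr_baseline[:steps_to_train])
print(lr_shrunk[-1] < lr_truncated[-1])
```

With Option B, a short HPO trial follows the same learning rate trajectory as the prefix of the baseline run, which is the behavior the summary above describes.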

Reviewed By: raghuramank100

Differential Revision: D48699087

fbshipit-source-id: 903f7c957ee471f36365c1449e9cd6a919fd260a
parent 54d9d91b
@@ -572,7 +572,18 @@ class Detectron2GoRunner(D2GoDataAPIMixIn, BaseRunner):
         # The checkpoint stores the training iteration that just finished, thus we start
         # at the next iteration (or iter zero if there's no checkpoint).
         start_iter += 1
-        max_iter = cfg.SOLVER.MAX_ITER
+        if "EARLY_STOPPING_FRACTION" in cfg.SOLVER:
+            assert (
+                cfg.SOLVER.EARLY_STOPPING_FRACTION >= 0
+            ), f"Early stopping fraction must be non-negative, but is {cfg.SOLVER.EARLY_STOPPING_FRACTION}"
+            assert (
+                cfg.SOLVER.EARLY_STOPPING_FRACTION <= 1
+            ), f"Early stopping fraction must not be larger than 1, but is {cfg.SOLVER.EARLY_STOPPING_FRACTION}"
+            max_iter = int(cfg.SOLVER.MAX_ITER * cfg.SOLVER.EARLY_STOPPING_FRACTION)
+        else:
+            max_iter = cfg.SOLVER.MAX_ITER
         periodic_checkpointer = PeriodicCheckpointer(
             checkpointer, cfg.SOLVER.CHECKPOINT_PERIOD, max_iter=max_iter
         )
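
The branch in the hunk above can be exercised in isolation with a small standalone sketch. It assumes a plain dict in place of `cfg.SOLVER`, and `resolve_max_iter` is a hypothetical helper name that does not exist in d2go; the logic mirrors the diff, including the truncating `int()` conversion.

```python
def resolve_max_iter(solver: dict) -> int:
    """Hypothetical helper: number of iterations to train for.

    `solver` is a plain dict standing in for cfg.SOLVER (an assumption
    made so the example runs without d2go installed).
    """
    if "EARLY_STOPPING_FRACTION" in solver:
        fraction = solver["EARLY_STOPPING_FRACTION"]
        assert (
            fraction >= 0
        ), f"Early stopping fraction must be non-negative, but is {fraction}"
        assert (
            fraction <= 1
        ), f"Early stopping fraction must not be larger than 1, but is {fraction}"
        # int() truncates toward zero, so a very small fraction yields 0 iterations.
        return int(solver["MAX_ITER"] * fraction)
    # Key absent: fall back to the full baseline run length.
    return solver["MAX_ITER"]


baseline = resolve_max_iter({"MAX_ITER": 1000})
quarter = resolve_max_iter({"MAX_ITER": 1000, "EARLY_STOPPING_FRACTION": 0.25})
tiny = resolve_max_iter({"MAX_ITER": 1000, "EARLY_STOPPING_FRACTION": 0.0005})
print(baseline, quarter, tiny)
```

Note that the computed value is what gets passed to the checkpointer as `max_iter`, while `SOLVER.MAX_ITER` itself is left unchanged in the config.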