Commit 2eccdbd2 authored by Michael Carilli

docstring

parent 0750a757
@@ -57,11 +57,13 @@ def scale_loss(loss,
             will use the default global loss scaler for this backward pass.
         model(torch.nn.Module, optional, default=None):  Currently unused, reserved to enable future
             optimizations.
-        delay_unscale(bool, optional, default=False):  ``delay_unscale`` is never necessary.
-            It's a minor ninja performance optimization and can result in weird gotchas (especially
-            with multiple models/optimzers/losses), so only use it if you know what you're doing.
+        delay_unscale(bool, optional, default=False):  ``delay_unscale`` is never necessary, and
+            the default value of ``False`` is strongly recommended.
             If ``True``, Amp will not unscale the gradients or perform model->master
             gradient copies on context manager exit.
+            ``delay_unscale=True`` is a minor ninja performance optimization and can result
+            in weird gotchas (especially with multiple models/optimizers/losses),
+            so only use it if you know what you're doing.
             "Gradient accumulation across iterations" under `Advanced Amp Usage`_
             illustrates a situation where this CAN (but does not need to) be used.
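
The docstring points to "Gradient accumulation across iterations" under `Advanced Amp Usage`_ as the one situation where ``delay_unscale=True`` can (but does not need to) be used. A minimal sketch of that pattern, assuming ``model`` and ``optimizer`` were already returned by ``amp.initialize`` and that ``criterion``, ``data_loader``, and ``accumulation_steps`` are defined by the surrounding training script:

    from apex import amp

    # Sketch: gradient accumulation with apex.amp. `model`, `optimizer`,
    # `criterion`, `data_loader`, and `accumulation_steps` are assumed to
    # come from the surrounding training script; names are illustrative.
    for i, (inputs, targets) in enumerate(data_loader):
        loss = criterion(model(inputs), targets) / accumulation_steps
        last_micro_step = (i + 1) % accumulation_steps == 0

        # delay_unscale=True merely skips the unscale / model->master copy on
        # the intermediate accumulation steps; per the docstring it is never
        # necessary, and delay_unscale=False would also be correct here,
        # just slightly slower.
        with amp.scale_loss(loss, optimizer, delay_unscale=not last_micro_step) as scaled_loss:
            scaled_loss.backward()

        if last_micro_step:
            optimizer.step()
            optimizer.zero_grad()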