Unverified Commit b5ccedc0 authored by Min Xu, committed by GitHub

add warning to adascale before it is validated (#169)

parent 4247f602
AdaScale SGD
============
Note: AdaScale is still experimental. It is being validated. APIs may change
in the future. Use at your own risk.
`AdaScale <https://arxiv.org/pdf/2007.05105.pdf>`_ adaptively scales the learning rate when using larger batch sizes for data-parallel training. Let's suppose that your trainer looks like
.. code-block:: python
......
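The trainer code block above is collapsed in this diff view. Purely as a hedged sketch (not the file's actual contents), a minimal single-process run might look like the following; the tiny model, the synthetic data, the gloo process group, and the assumption that ``patch_optimizer=True`` routes the ordinary ``optim.step()`` call through the AdaScale gain are all illustrative guesses, not documented API:

.. code-block:: python

    import os

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.optim import SGD

    from fairscale.optim import AdaScale

    # Single-process "gloo" group so the sketch is self-contained.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    # Stand-in model and optimizer, just to make the loop complete.
    model = DDP(torch.nn.Linear(10, 2))
    optim = SGD(model.parameters(), lr=0.1)

    # Wrap the optimizer; patch_optimizer=True is assumed here to make the
    # usual optim.step() call apply AdaScale's learning-rate gain.
    AdaScale(optim, patch_optimizer=True)

    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(3):
        inputs = torch.randn(8, 10)
        targets = torch.randint(0, 2, (8,))
        optim.zero_grad()
        loss_fn(model(inputs), targets).backward()  # AdaScale gathers grad statistics
        optim.step()                                # step rescaled by the gain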
@@ -32,6 +32,7 @@
# POSSIBILITY OF SUCH DAMAGE.
import functools
import logging
from typing import Any, Dict, Optional
import numpy as np
@@ -79,6 +80,8 @@ class AdaScale(object):
        smoothing: float = 0.999,
        patch_optimizer: bool = False,
    ):
        logging.warn("AdaScale is experimental. APIs may change. Use at your own risk.")
        self._optimizer = optimizer
        self._optimizer_step = optimizer.step
        self._local_grad_sqr: Optional[torch.Tensor] = None
......
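The new warning goes through the stdlib ``logging`` module's root logger, so it surfaces once per ``AdaScale`` construction wherever WARNING-level logging is enabled. A tiny self-contained sketch of the exact call the diff adds and what the default root handler prints:

.. code-block:: python

    import logging

    # Same call the commit adds to AdaScale.__init__. logging.warn is a
    # legacy alias for logging.warning; both emit at WARNING level on the
    # root logger.
    logging.basicConfig(level=logging.WARNING)
    logging.warn("AdaScale is experimental. APIs may change. Use at your own risk.")
    # Prints: WARNING:root:AdaScale is experimental. APIs may change. Use at your own risk.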