OpenDAS / apex, commit 9ce80178
Authored Jun 24, 2019 by Michael Carilli

Merge branch 'master' of https://github.com/NVIDIA/apex

Parents: f8557569, f17cd953
Showing 2 changed files with 7 additions and 2 deletions:

- README.md (+2, -2)
- docs/source/advanced.rst (+5, -0)
README.md
@@ -80,12 +80,12 @@ CUDA and C++ extensions via
 ```
 $ git clone https://github.com/NVIDIA/apex
 $ cd apex
-$ pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
+$ pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
 ```
 Apex also supports a Python-only build (required with Pytorch 0.4) via
 ```
-$ pip install -v --no-cache-dir .
+$ pip install -v --no-cache-dir ./
 ```
 A Python-only build omits:
 - Fused kernels required to use `apex.optimizers.FusedAdam`.
 ...
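The README hunk above edits the two install commands; the CUDA and C++ extension build is what provides the fused kernels behind `apex.optimizers.FusedAdam`, which a Python-only build omits. As an illustration (not part of this commit), the sketch below assumes a CUDA-capable machine and a hypothetical `build_optimizer` helper, and simply falls back to `torch.optim.Adam` when the fused kernels were not compiled.

```python
import torch

# Illustrative sketch, not part of the commit: prefer FusedAdam when the
# apex CUDA/C++ extensions were built, otherwise fall back to torch.optim.Adam.
def build_optimizer(model, lr=1e-3):
    try:
        from apex.optimizers import FusedAdam
        return FusedAdam(model.parameters(), lr=lr)
    except (ImportError, RuntimeError):
        # A Python-only apex build lacks the fused kernels, so constructing
        # FusedAdam can fail here; depending on the apex version the failure
        # may surface as ImportError or RuntimeError.
        return torch.optim.Adam(model.parameters(), lr=lr)

model = torch.nn.Linear(16, 4).cuda()
optimizer = build_optimizer(model)
```

Catching both exception types is deliberate, since a missing extension can surface as either depending on the apex version.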
docs/source/advanced.rst
@@ -145,6 +145,11 @@ Gradient accumulation across iterations
 The following should "just work," and properly accommodate multiple models/optimizers/losses, as well as
 gradient clipping via the `instructions above`_::
 
+    # If your intent is to simulate a larger batch size using gradient accumulation,
+    # you can divide the loss by the number of accumulation iterations (so that gradients
+    # will be averaged over that many iterations):
+    loss = loss / iters_to_accumulate
+
     if iter % iters_to_accumulate == 0:
         # Every iters_to_accumulate iterations, unscale and step
         with amp.scale_loss(loss, optimizer) as scaled_loss:
 ...
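To make the accumulation pattern in the advanced.rst hunk concrete, here is a minimal, self-contained sketch. It is illustrative rather than part of the commit: the model, data, `opt_level`, and `iters_to_accumulate` value are placeholder assumptions, and it uses the simpler variant that runs `scaled_loss.backward()` every iteration and steps every `iters_to_accumulate` iterations, rather than reproducing whatever follows the truncated context (`...`) in the diff.

```python
import torch
from apex import amp

# Placeholder model, optimizer, and data; the advanced.rst snippet assumes
# these already exist in the user's training script.
model = torch.nn.Linear(32, 8).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

loss_fn = torch.nn.MSELoss()
iters_to_accumulate = 4  # simulate a batch roughly 4x larger than fits in memory

for iteration in range(1, 101):
    inputs = torch.randn(16, 32, device="cuda")
    targets = torch.randn(16, 8, device="cuda")

    # Compute the loss in fp32 to keep the sketch dtype-safe under opt_level O1.
    loss = loss_fn(model(inputs).float(), targets)

    # Divide by the accumulation count so gradients are averaged over the
    # accumulated iterations rather than summed.
    loss = loss / iters_to_accumulate

    # Scale the loss each iteration; the fp32 master gradients keep accumulating.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()

    if iteration % iters_to_accumulate == 0:
        # Every iters_to_accumulate iterations, take an optimizer step and
        # reset the accumulated gradients.
        optimizer.step()
        optimizer.zero_grad()
```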