Commit 0f703d13 authored by Michael Carilli

Added Dockerfile example, more readme updates

parent 942174bf
@@ -6,25 +6,29 @@ Some of the code here will be included in upstream Pytorch eventually.
The intention of Apex is to make up-to-date utilities available to users as quickly as possible.

## Full API Documentation: [https://nvidia.github.io/apex](https://nvidia.github.io/apex)

# Contents

## 1. Mixed Precision

### amp: Automatic Mixed Precision

`apex.amp` is a tool designed for ease of use and maximum safety in FP16 training. All potentially unsafe ops are performed in FP32 under the hood, while safe ops are performed using faster, Tensor Core-friendly FP16 math. `amp` also automatically implements dynamic loss scaling.

The intention of `amp` is to be the "on-ramp" to easy FP16 training: achieve all the numerical stability of full FP32 training, with most of the performance benefits of full FP16 training.

[Python Source and API Documentation](https://github.com/NVIDIA/apex/tree/master/apex/amp)
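For orientation, typical `amp` usage looks roughly like the sketch below. The model, data shapes, and optimizer are hypothetical, and the `amp.init()` handle with its `scale_loss` context manager follows the interface described in the amp documentation linked above; treat the linked source as authoritative.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from apex import amp

# Initialize amp once near the top of the training script. It transparently
# routes potentially unsafe ops to FP32 and Tensor Core-friendly ops to FP16.
amp_handle = amp.init()

# Hypothetical model, optimizer, and data purely for illustration.
model = nn.Linear(1024, 1024).cuda()
optimizer = optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    x = torch.randn(32, 1024, device='cuda')
    target = torch.randn(32, 1024, device='cuda')
    optimizer.zero_grad()
    loss = loss_fn(model(x), target)
    # Wrap the backward pass so amp can apply dynamic loss scaling.
    with amp_handle.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
```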
### FP16_Optimizer
`apex.FP16_Optimizer` wraps an existing Python optimizer and automatically implements master parameters and static or dynamic loss scaling under the hood.

The intention of `FP16_Optimizer` is to be the "highway" for FP16 training: achieve most of the numerical stability of full FP32 training, and almost all the performance benefits of full FP16 training.

[Python Source](https://github.com/NVIDIA/apex/tree/master/apex/fp16_utils)
[API Documentation](https://nvidia.github.io/apex/fp16_utils.html#automatic-management-of-master-params-loss-scaling)
[Simple examples with FP16_Optimizer](https://github.com/NVIDIA/apex/tree/master/examples/FP16_Optimizer_simple)
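The wrapping pattern those examples demonstrate is roughly the following sketch. The model and data are hypothetical; `FP16_Optimizer`, its `backward` method, and the `dynamic_loss_scale` argument come from the fp16_utils source linked above, so consult that source and the API documentation for the exact constructor options.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from apex.fp16_utils import FP16_Optimizer

# Hypothetical FP16 model purely for illustration.
model = nn.Linear(1024, 1024).cuda().half()

optimizer = optim.SGD(model.parameters(), lr=1e-3)
# Wrap the optimizer. FP16_Optimizer maintains FP32 master copies of the FP16
# parameters and applies loss scaling (dynamic here) under the hood.
optimizer = FP16_Optimizer(optimizer, dynamic_loss_scale=True)

x = torch.randn(32, 1024, device='cuda', dtype=torch.half)
target = torch.randn(32, 1024, device='cuda', dtype=torch.half)

optimizer.zero_grad()
loss = F.mse_loss(model(x), target)
# Call backward through the wrapper so the loss can be scaled and the FP16
# gradients copied into the FP32 master gradients.
optimizer.backward(loss)
optimizer.step()
```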
@@ -43,9 +47,11 @@ optimized for NVIDIA's NCCL communication library.
`apex.parallel.multiproc` is a launch utility that helps set up arguments for `DistributedDataParallel`.

[API Documentation](https://nvidia.github.io/apex/parallel.html)
[Python Source](https://github.com/NVIDIA/apex/tree/master/apex/parallel)
[Example/Walkthrough](https://github.com/NVIDIA/apex/tree/master/examples/distributed)
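In code, adopting the Apex `DistributedDataParallel` is roughly the sketch below. It assumes one process per GPU with `RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` already set in the environment (the kind of plumbing `apex.parallel.multiproc` and the walkthrough cover); `LOCAL_RANK` is only an illustrative variable name, not something Apex defines.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from apex.parallel import DistributedDataParallel as DDP

# Assumes one process per GPU, with RANK / WORLD_SIZE / MASTER_ADDR / MASTER_PORT
# already exported; setting that up is what apex.parallel.multiproc helps with.
dist.init_process_group(backend='nccl', init_method='env://')

# LOCAL_RANK is an illustrative convention for choosing this process's GPU.
local_rank = int(os.environ.get('LOCAL_RANK', '0'))
torch.cuda.set_device(local_rank)

model = nn.Linear(1024, 1024).cuda()
# Wrapping the model overlaps gradient all-reduce (over NCCL) with the backward
# pass; the rest of the training loop is unchanged.
model = DDP(model)
```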
# Requirements
...
@@ -7,6 +7,6 @@ multiproc.py contains the source code for `apex.parallel.multiproc`, a launch ut
### [API Documentation](https://nvidia.github.io/apex/parallel.html)

### [Example/Walkthrough](https://github.com/NVIDIA/apex/tree/master/examples/distributed)
## Contents:
- `distributed`: Walkthrough of the Apex distributed data parallel utilities.
- `FP16_Optimizer_simple`: Simple examples demonstrating various use cases of `FP16_Optimizer` to automatically manage master parameters and static or dynamic loss scaling.
- `imagenet`: Example based on [https://github.com/pytorch/examples/tree/master/imagenet](https://github.com/pytorch/examples/tree/master/imagenet) showing the use of `FP16_Optimizer`, as well as manual management of master parameters and loss scaling for illustration/comparison (a generic sketch of that manual pattern follows this list).
- `word_language_model`: Example based on [https://github.com/pytorch/examples/tree/master/word_language_model](https://github.com/pytorch/examples/tree/master/word_language_model) showing the use of `FP16_Optimizer`, as well as manual management of master parameters and loss scaling for illustration/comparison.
- `docker`: Example of a minimal Dockerfile that installs Apex on top of the Pytorch 0.4 container.
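For readers who want a feel for what "manual management of master parameters and loss scaling" involves before opening the imagenet or word_language_model code, the core idea reduces to something like the generic sketch below. It uses a toy model and a static loss scale, and is not the examples' actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Toy FP16 model purely for illustration.
model = nn.Linear(1024, 1024).cuda().half()

# FP32 "master" copies of the FP16 parameters; the optimizer updates these.
master_params = [p.detach().clone().float() for p in model.parameters()]
for master in master_params:
    master.requires_grad = True
optimizer = optim.SGD(master_params, lr=1e-3)

loss_scale = 128.0  # static scale here; real code may adjust it dynamically

x = torch.randn(32, 1024, device='cuda', dtype=torch.half)
target = torch.randn(32, 1024, device='cuda', dtype=torch.half)

optimizer.zero_grad()
loss = F.mse_loss(model(x), target)
# Scale the loss so small FP16 gradients do not underflow during backward.
(loss * loss_scale).backward()

# Copy the FP16 gradients into the FP32 master gradients and undo the scaling.
for param, master in zip(model.parameters(), master_params):
    if master.grad is None:
        master.grad = torch.empty_like(master)
    master.grad.copy_(param.grad)
    master.grad.div_(loss_scale)

optimizer.step()  # the weight update happens in FP32

# Copy the updated FP32 master weights back into the FP16 model.
for param, master in zip(model.parameters(), master_params):
    param.data.copy_(master.data)
```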
FROM pytorch/pytorch:0.4_cuda9_cudnn7
WORKDIR /workspace
# uninstall Apex if present
RUN pip uninstall -y apex || :
# SHA is something the user can alter to force recreation of this Docker layer,
# and therefore force cloning the latest version of Apex
RUN SHA=43d1ae08 git clone https://github.com/NVIDIA/apex.git
# Build and install Apex from the cloned source, then return to the workspace root
WORKDIR /workspace/apex
RUN python setup.py install
WORKDIR /workspace
Example of a minimal Dockerfile that installs Apex on top of upstream Pytorch's stable 0.4 container (pytorch/pytorch:0.4_cuda9_cudnn7).
# ImageNet training in PyTorch

This example is based on [https://github.com/pytorch/examples/tree/master/imagenet](https://github.com/pytorch/examples/tree/master/imagenet).

It implements training of popular model architectures, such as ResNet, AlexNet, and VGG, on the ImageNet dataset.

`main.py` and `main_fp16_optimizer.py` have been modified to use the `DistributedDataParallel` module in Apex instead of the one in upstream PyTorch. For a description of how this works, please see the distributed example included in this repo.
...
# Word-level language modeling RNN

This example is based on [https://github.com/pytorch/examples/tree/master/word_language_model](https://github.com/pytorch/examples/tree/master/word_language_model).

It trains a multi-layer RNN (Elman, GRU, or LSTM) on a language modeling task.
By default, the training script uses the provided Wikitext-2 dataset.
The trained model can then be used by the generate script to generate new text.
...