Commit 2e7d799f authored by Michael Carilli

Response to issue #16: more comprehensive instructions in examples/docker

parent d52edb9e
```
# Base image must at least have pytorch and CUDA installed.
FROM <base image>
WORKDIR /workspace
# uninstall Apex if present
RUN pip uninstall -y apex || :
# SHA is something the user can touch to force recreation of this Docker layer,
# and therefore force cloning of the latest version of Apex
RUN SHA=ToUcHMe git clone https://github.com/NVIDIA/apex.git
WORKDIR /workspace/apex
RUN python setup.py install
WORKDIR /workspace
```
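The `RUN pip uninstall -y apex || :` line above relies on a standard shell idiom: `:` is a no-op builtin that always exits 0, so the layer still builds successfully when Apex is not yet installed. A minimal illustration outside Docker:

```shell
# `false` fails, but `|| :` swallows the failure: `:` does nothing
# and exits 0, so the overall command line succeeds.
false || :
echo "exit status: $?"
# prints "exit status: 0"
```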
Example of a minimal Dockerfile that pulls and installs the latest version of Apex.

*Dockerfile* is a simple template that shows how to install the latest Apex on top of an existing image. Edit *Dockerfile* to choose a base image, then run
```
docker build -t image_with_apex .
```
If you want to rebuild your image and force the latest Apex to be cloned and installed, make any small change to the `SHA` variable on line 8 of *Dockerfile*.
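As an alternative to editing the file by hand, Docker's `ARG` mechanism can invalidate the cache at build time. This is only a sketch of how the clone step could be written, not part of the shipped *Dockerfile*; the `SHA` build-argument name is illustrative:

```
# Hypothetical variant of the clone step: passing a new value for SHA
# changes this layer, invalidating the cache and forcing a fresh clone.
ARG SHA=unknown
RUN echo "cache-bust: $SHA" && git clone https://github.com/NVIDIA/apex.git
```

It would be driven with, for example, `docker build --build-arg SHA=$(date +%s) -t image_with_apex .`.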
*base_images.md* provides guidance on base images to use in the `FROM <base image>` line of *Dockerfile*.
Instead of building a new image, you can also clone Apex on bare metal and mount the Apex repo into your container at launch, for example by running
```
docker run --runtime=nvidia -it --rm --ipc=host -v /bare/metal/apex:/apex/in/container <base image>
```
Then, inside the running container, go to /apex/in/container and run `python setup.py install`.
When specifying
```
FROM <base image>
```
in *Dockerfile*, `<base image>` must have Pytorch and CUDA installed.
If you have an NGC account, you can use Nvidia's official Pytorch container
```
nvcr.io/nvidia/pytorch:18.04-py3
```
as `<base image>`.
If you don't have an NGC account, you can sign up for one for free by following the instructions [here](https://docs.nvidia.com/ngc/ngc-getting-started-guide/index.html#generating-api-key).
An alternative is to first
[build a local Pytorch image](https://github.com/pytorch/pytorch#docker-image) using Pytorch's Dockerfile on Github. From the root of your cloned Pytorch repo,
run
```
docker build -t my_pytorch_image -f docker/pytorch/Dockerfile .
```
`my_pytorch_image` will contain CUDA and can be used as `<base image>`.
Warning:
Currently, Pytorch's latest stable image on Dockerhub
[pytorch/pytorch:0.4_cuda9_cudnn7](https://hub.docker.com/r/pytorch/pytorch/tags/) contains Pytorch installed with prebuilt binaries. It does not contain NVCC, which means it is not an eligible candidate for `<base image>`.
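One quick sanity check on a candidate base image is to look for the CUDA compiler inside it. This is a sketch to run within the container; note that `command -v` only tests whether the binary is on `PATH`:

```shell
# Apex's build needs nvcc, which images built from prebuilt
# Pytorch binaries often omit. Report whether it is present.
if command -v nvcc >/dev/null 2>&1; then
  echo "nvcc found"
else
  echo "nvcc missing"
fi
```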
import torch.cuda
import ctypes.util
import os
import re
import subprocess
......