Doc container usage and workaround for nvidia errors

8cc0ee2e · Daniel Hiltgen · d5eec16d · 8cc0ee2e · 8cc0ee2e · 8cc0ee2e
Commit 8cc0ee2e authored May 09, 2024 by Daniel Hiltgen

Showing with 92 additions and 2 deletions

docs/README.md +1 -1

docs/docker.md +71 -0

docs/troubleshooting.md +20 -1

--- a/docs/README.md
+++ b/docs/README.md
@@ -6,7 +6,7 @@
 * [Importing models](./import.md)
 * [Linux Documentation](./linux.md)
 * [Windows Documentation](./windows.md)
-* [Docker Documentation](https://hub.docker.com/r/ollama/ollama)
+* [Docker Documentation](./docker.md)
 ### Reference

--- a/docs/docker.md
+++ b/docs/docker.md
+# Ollama Docker image
+### CPU only
+```bash
+docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
+```
+### NVIDIA GPU
+Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation).
+#### Install with Apt
+1.  Configure the repository
+```bash
+curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
+    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
+curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
+    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
+    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+sudo apt-get update
+```
+2.  Install the NVIDIA Container Toolkit packages
+```bash
+sudo apt-get install -y nvidia-container-toolkit
+```
+#### Install with Yum or Dnf
+1.  Configure the repository
+```bash
+curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
+    | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
+```
+2.  Install the NVIDIA Container Toolkit packages
+```bash
+sudo yum install -y nvidia-container-toolkit
+```
+#### Configure Docker to use the NVIDIA driver
+```bash
+sudo nvidia-ctk runtime configure --runtime=docker
+sudo systemctl restart docker
+```
+#### Start the container
+```bash
+docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
+```
+### AMD GPU
+To run Ollama using Docker with AMD GPUs, use the `rocm` tag and the following command:
+```bash
+docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
+```
+### Run model locally
+Now you can run a model:
+```bash
+docker exec -it ollama ollama run llama3
+```
+### Try different models
+More models can be found on the [Ollama library](https://ollama.com/library).
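
A note on the Apt repository step in docs/docker.md: the `sed` expression there adds a `signed-by` option to each `deb https://` entry, which ties the repository to the keyring written by the preceding `gpg --dearmor` command. The sketch below applies the same expression to a single sample line to show the effect. The `example.invalid` entry and the `add_signed_by` helper name are made up for illustration and are not part of the commit.

```bash
#!/usr/bin/env bash
# Sketch: run the sed expression from the Apt setup against one sample entry.
# The sample line is an assumed example, not copied from the real list file.
keyring=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

add_signed_by() {
    # Insert the signed-by option after "deb" on every matching line of stdin.
    sed "s#deb https://#deb [signed-by=${keyring}] https://#g"
}

sample='deb https://example.invalid/stable/deb/amd64 /'
printf '%s\n' "$sample" | add_signed_by
```

Lines that do not start with `deb https://` (comments, for instance) pass through unchanged, so the filter is safe to run over a whole list file.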
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -82,4 +82,23 @@ curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.29" sh
 If your system is configured with the "noexec" flag where Ollama stores its
 temporary executable files, you can specify an alternate location by setting
 OLLAMA_TMPDIR to a location writable by the user ollama runs as.  For example
 OLLAMA_TMPDIR=/usr/share/ollama/
\ No newline at end of file
+## Container fails to run on NVIDIA GPU
+Make sure you've set up the container runtime first as described in [docker.md](./docker.md).
+Sometimes the container runtime can have difficulties initializing the GPU.
+When you check the server logs, this can show up as various error codes, such
+as "3" (not initialized), "46" (device unavailable), "100" (no device), "999"
+(unknown), or others.  The following troubleshooting techniques may help resolve
+the problem:
+- If the uvm driver is not loaded, load it: `sudo nvidia-modprobe -u`
+- Try reloading the nvidia_uvm driver: `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm`
+- Try rebooting
+- Make sure you're running the latest NVIDIA drivers
+If none of those resolve the problem, gather additional information and file an issue:
+- Set `CUDA_ERROR_LEVEL=50` and try again to get more diagnostic logs
+- Check dmesg for any errors: `sudo dmesg | grep -i nvrm` and `sudo dmesg | grep -i nvidia`
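
The error codes listed in the new troubleshooting section can be kept in a small lookup when scanning server logs. This is a minimal sketch that only encodes the four codes named in the diff. The `describe_gpu_error` function name is hypothetical, and any other code falls through to a generic message.

```bash
#!/usr/bin/env bash
# Sketch: map the error codes named in docs/troubleshooting.md to their meanings.
# describe_gpu_error is a hypothetical helper, not part of Ollama.
describe_gpu_error() {
    case "$1" in
        3)   echo "not initialized" ;;
        46)  echo "device unavailable" ;;
        100) echo "no device" ;;
        999) echo "unknown" ;;
        *)   echo "not listed in the troubleshooting notes" ;;
    esac
}

# Print every documented code with its description.
for code in 3 46 100 999; do
    printf '%s: %s\n' "$code" "$(describe_gpu_error "$code")"
done
```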