OpenDAS / ollama

Commit d8fdbfd8, authored Mar 21, 2024 by Daniel Hiltgen
Parent: a5ba0fcf

Add docs for GPU selection and nvidia uvm workaround
Showing 2 changed files with 28 additions and 10 deletions:

- docs/faq.md (+0, -10)
- docs/gpu.md (+28, -0)
docs/faq.md

@@ -228,13 +228,3 @@ To unload the model and free up memory use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'
```
## Controlling which GPUs to use

By default, on Linux and Windows, Ollama will attempt to use NVIDIA or Radeon
GPUs, and will use all the GPUs it can find. You can limit which GPUs will be
used by setting the environment variable `CUDA_VISIBLE_DEVICES` for NVIDIA
cards, or `HIP_VISIBLE_DEVICES` for Radeon GPUs, to a comma-delimited list of
GPU IDs. You can see the list of devices with GPU tools such as `nvidia-smi`
or `rocminfo`. You can set an invalid GPU ID (e.g., "-1") to bypass the GPUs
and fall back to the CPU.
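For example, a sketch assuming the server is launched from a shell (the GPU
IDs shown are placeholders for your own):

```shell
# Restrict Ollama to GPUs 0 and 1 on NVIDIA hardware
CUDA_VISIBLE_DEVICES=0,1 ollama serve

# The equivalent for Radeon GPUs under ROCm
HIP_VISIBLE_DEVICES=0,1 ollama serve

# Pass an invalid ID to bypass the GPUs and fall back to the CPU
CUDA_VISIBLE_DEVICES=-1 ollama serve
```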
docs/gpu.md

@@ -29,6 +29,21 @@ Check your compute compatibility to see if your card is supported:
| | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
### GPU Selection

If you have multiple NVIDIA GPUs in your system and want to limit Ollama to
use a subset, you can set `CUDA_VISIBLE_DEVICES` to a comma-separated list of
GPUs. Numeric IDs may be used; however, ordering may vary, so UUIDs are more
reliable. You can discover the UUIDs of your GPUs by running `nvidia-smi -L`.
If you want to ignore the GPUs and force CPU usage, use an invalid GPU ID
(e.g., "-1").
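For instance, a sketch assuming a single discrete card (the UUID below is a
placeholder, not a real device):

```shell
# List installed GPUs and their UUIDs
nvidia-smi -L
# GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

# Pin Ollama to that device by UUID, which stays stable even if
# enumeration order changes between boots
CUDA_VISIBLE_DEVICES=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx ollama serve
```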
### Laptop Suspend Resume

On Linux, after a suspend/resume cycle, Ollama may fail to discover your
NVIDIA GPU and fall back to running on the CPU. You can work around this
driver bug by reloading the NVIDIA UVM driver with `sudo rmmod nvidia_uvm &&
sudo modprobe nvidia_uvm`.
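The same workaround as a copy-pasteable block, with an optional check
afterwards (assumes the standard NVIDIA driver tools are installed):

```shell
# Reload the NVIDIA UVM kernel module after a failed suspend/resume
sudo rmmod nvidia_uvm && sudo modprobe nvidia_uvm

# Optionally confirm the driver can see the GPU again
nvidia-smi
```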
## AMD Radeon
Ollama supports the following AMD GPUs:
| Family | Cards and accelerators |
@@ -70,5 +85,18 @@ future release which should increase support for more GPUs.

Reach out on [Discord](https://discord.gg/ollama) or file an
[issue](https://github.com/ollama/ollama/issues) for additional help.
### GPU Selection

If you have multiple AMD GPUs in your system and want to limit Ollama to use a
subset, you can set `HIP_VISIBLE_DEVICES` to a comma-separated list of GPUs.
You can see the list of devices with `rocminfo`. If you want to ignore the
GPUs and force CPU usage, use an invalid GPU ID (e.g., "-1").
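For example (the device indices here are illustrative):

```shell
# Enumerate ROCm devices to find their indices
rocminfo

# Limit Ollama to the first two Radeon GPUs
HIP_VISIBLE_DEVICES=0,1 ollama serve

# Force CPU-only inference with an invalid ID
HIP_VISIBLE_DEVICES=-1 ollama serve
```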
### Container Permission

In some Linux distributions, SELinux can prevent containers from accessing the
AMD GPU devices. On the host system, you can run
`sudo setsebool container_use_devices=1` to allow containers to use devices.
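For example; note that adding `-P` makes the boolean persist across reboots,
while omitting it applies the change only until the next reboot:

```shell
# Allow containers to access host devices until the next reboot
sudo setsebool container_use_devices=1

# Or make the change persistent
sudo setsebool -P container_use_devices=1
```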
### Metal (Apple GPUs)
Ollama supports GPU acceleration on Apple devices via the Metal API.