    Remove no longer supported max vram var · cc269ba0
    Daniel Hiltgen authored
    The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM
    scenarios. Once concurrency support landed it was no longer wired up, and
    the simplistic value doesn't map to multi-GPU setups. Users can still set
    `num_gpu` to limit memory usage and avoid OOM if we get our predictions wrong.
cmd.go 31.3 KB