"llm/git@developer.sourcefind.cn:OpenDAS/ollama.git" did not exist on "b85982eb9138a36d6b17f7fa2b555dfd92da8738"
Commit 0d3304d7 authored by myhloli

perf(inference): adjust batch ratio for GPU memory sizes

- Simplify batch ratio logic for GPU memory >= 16GB
- Remove unnecessary conditions for 20GB and 40GB memory
parent 59fc80d4
@@ -170,11 +170,7 @@ def doc_analyze(
     gpu_memory = int(os.getenv("VIRTUAL_VRAM_SIZE", round(get_vram(device))))
     if gpu_memory is not None and gpu_memory >= 8:
-        if gpu_memory >= 40:
-            batch_ratio = 32
-        elif gpu_memory >=20:
-            batch_ratio = 16
-        elif gpu_memory >= 16:
+        if gpu_memory >= 16:
             batch_ratio = 8
         elif gpu_memory >= 10:
             batch_ratio = 4
...
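The effect of the hunk above can be sketched as a standalone comparison of the tier selection before and after this commit. This is a minimal illustration, not the project's code: `resolve_gpu_memory`, `batch_ratio_before`, and `batch_ratio_after` are hypothetical helper names, the detected VRAM is passed in as a plain number instead of calling `get_vram(device)`, and the branches below 10 GB are elided in the diff, so the sketch returns `None` there rather than guessing.

```python
import os


def resolve_gpu_memory(detected_gb, env=os.environ):
    """Return GPU memory in whole GB; VIRTUAL_VRAM_SIZE overrides detection.

    Hypothetical helper: detected_gb stands in for get_vram(device).
    """
    override = env.get("VIRTUAL_VRAM_SIZE")
    return int(override) if override is not None else round(detected_gb)


def batch_ratio_before(gpu_memory):
    """Tier selection as shown on the removed side of the diff."""
    if gpu_memory >= 40:
        return 32
    elif gpu_memory >= 20:
        return 16
    elif gpu_memory >= 16:
        return 8
    elif gpu_memory >= 10:
        return 4
    return None  # lower tiers are not visible in the diff


def batch_ratio_after(gpu_memory):
    """Tier selection as shown on the added side of the diff."""
    if gpu_memory >= 16:
        return 8
    elif gpu_memory >= 10:
        return 4
    return None  # lower tiers are not visible in the diff


if __name__ == "__main__":
    for gb in (12, 16, 24, 40):
        print(gb, batch_ratio_before(gb), batch_ratio_after(gb))
```

Under these assumptions, only cards with 20 GB or more change behavior: a 24 GB card drops from a ratio of 16 to 8, and a 40 GB card drops from 32 to 8, while 10 to 19 GB cards are unaffected.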