llm: Don't try to load split vision models in the Ollama engine

If a model with a split vision projector is loaded in the Ollama engine, the projector will be ignored and the model will hallucinate a response. Instead, fall back and try to load the model in the llama engine.

Commit aba15753, authored Sep 10, 2025 by Jesse Gross; committed Sep 11, 2025 by Jesse Gross

Showing with 5 additions and 1 deletion

llm/server.go +5 -1

--- a/llm/server.go
+++ b/llm/server.go
@@ -149,7 +149,11 @@ func NewLlamaServer(gpus discover.GpuInfoList, modelPath string, f *ggml.GGML, a
 	var textProcessor model.TextProcessor
 	var err error
 	if envconfig.NewEngine() || f.KV().OllamaEngineRequired() {
-		textProcessor, err = model.NewTextProcessor(modelPath)
+		if len(projectors) == 0 {
+			textProcessor, err = model.NewTextProcessor(modelPath)
+		} else {
+			err = errors.New("split vision models aren't supported")
+		}
 		if err != nil {
 			// To prepare for opt-out mode, instead of treating this as an error, we fallback to the old runner
 			slog.Debug("model not yet supported by Ollama engine, switching to compatibility mode", "model", modelPath, "error", err)