llama/llava.cpp · c826e57475eb24a68b6d56aa00aeb31136d3e520 · orangecat / ollama

runner.go: Better abstract vision model integration · c826e574

Jesse Gross authored Oct 11, 2024



- Update mllama to take the cross attention state as embeddings in
  a batch, more similar to how Llava handles it. This improves
  integration with the input cache.
- Pass locations in a prompt for embeddings using tags similar to Llava.
- Abstract interface to vision models so the main runner accesses Clip
  and Mllama similarly.

Co-authored-by: Michael Yang <mxyng@pm.me>
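
As an illustration of the second point above, here is a minimal Go sketch of splitting a prompt on image tags so that each tag marks where an embedding would be placed. It is not the code from this commit: the `[img-N]` tag format, the `part` type, and the `splitPrompt` function are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// part is one piece of a prompt: either plain text or a reference to an image.
// The type and its fields are hypothetical, used only for this sketch.
type part struct {
	text    string
	imageID int // meaningful only when isImage is true
	isImage bool
}

// imageTag matches placeholders such as [img-0]. The tag format is an assumption.
var imageTag = regexp.MustCompile(`\[img-(\d+)\]`)

// splitPrompt walks the prompt left to right and returns text and image parts
// in the order they appear, so a caller knows where each embedding belongs.
func splitPrompt(prompt string) []part {
	var parts []part
	last := 0
	for _, m := range imageTag.FindAllStringSubmatchIndex(prompt, -1) {
		// m[0]:m[1] spans the whole tag, m[2]:m[3] spans the digits.
		if m[0] > last {
			parts = append(parts, part{text: prompt[last:m[0]]})
		}
		id, err := strconv.Atoi(prompt[m[2]:m[3]])
		if err != nil {
			// The number does not fit in an int; keep the tag as plain text.
			parts = append(parts, part{text: prompt[m[0]:m[1]]})
		} else {
			parts = append(parts, part{imageID: id, isImage: true})
		}
		last = m[1]
	}
	if last < len(prompt) {
		parts = append(parts, part{text: prompt[last:]})
	}
	return parts
}

func main() {
	for _, p := range splitPrompt("Describe [img-0] and compare it with [img-1].") {
		if p.isImage {
			fmt.Printf("image %d\n", p.imageID)
		} else {
			fmt.Printf("text %q\n", p.text)
		}
	}
}
```

Keeping text and image references in one ordered slice means a runner could, under these assumptions, tokenize the text parts and substitute embeddings for the image parts without tracking positions separately.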


llava.cpp 23.3 KB
