runner.go: Better abstract vision model integration
- Update mllama to take the cross-attention state as embeddings in a
  batch, more similar to how Llava handles it. This improves integration
  with the input cache (see the batch-input sketch after this list).
- Pass the locations of images in a prompt for embeddings using tags,
  similar to Llava (see the prompt-splitting sketch below).
- Abstract an interface to vision models so the main runner accesses Clip
  and Mllama in the same way (see the interface sketch below).
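
A minimal sketch of what carrying vision output as batch embeddings could
look like. The `input` struct, its field names, and the flattening helper
are assumptions for illustration, not the runner's actual types:

```go
package runner

// input is one element of a batch: either a text token or one image
// embedding vector. Field names here are hypothetical.
type input struct {
	token int       // valid when embed is nil
	embed []float32 // one vision embedding vector; nil for text inputs
}

// inputsFromEmbeddings flattens a [positions][dim] embedding matrix into
// per-position inputs, so the input cache can track image positions the
// same way it tracks token positions.
func inputsFromEmbeddings(embeds [][]float32) []input {
	inputs := make([]input, 0, len(embeds))
	for _, e := range embeds {
		inputs = append(inputs, input{embed: e})
	}
	return inputs
}
```

Treating image positions as ordinary batch entries is what lets the input
cache reuse or evict them with the same logic it applies to text tokens.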
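
A sketch of tag-based prompt splitting. The `[img-N]` tag syntax is
borrowed from Llava-style prompts; the `chunk` type and `splitPrompt`
helper are illustrative assumptions:

```go
package runner

import (
	"fmt"
	"regexp"
	"strconv"
)

// imageTag matches placeholders like [img-0] that mark where an image's
// embeddings belong in the prompt.
var imageTag = regexp.MustCompile(`\[img-(\d+)\]`)

// chunk is either literal prompt text or a reference to an image by index.
type chunk struct {
	text    string
	imageID int
	isImage bool
}

// splitPrompt cuts a prompt into text chunks and image references so the
// runner can interleave token embeddings with vision embeddings.
func splitPrompt(prompt string) ([]chunk, error) {
	var chunks []chunk
	last := 0
	for _, loc := range imageTag.FindAllStringSubmatchIndex(prompt, -1) {
		if loc[0] > last {
			chunks = append(chunks, chunk{text: prompt[last:loc[0]]})
		}
		id, err := strconv.Atoi(prompt[loc[2]:loc[3]])
		if err != nil {
			return nil, fmt.Errorf("bad image tag: %w", err)
		}
		chunks = append(chunks, chunk{imageID: id, isImage: true})
		last = loc[1]
	}
	if last < len(prompt) {
		chunks = append(chunks, chunk{text: prompt[last:]})
	}
	return chunks, nil
}
```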
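
A sketch of the shared vision-model abstraction. The interface name,
method signature, and the stub implementations are assumptions to show
the shape of the abstraction, not the real API:

```go
package runner

import "errors"

// visionModel is a hypothetical common interface so the main runner can
// drive Clip and Mllama the same way.
type visionModel interface {
	// NewEmbed turns raw image bytes into one embedding vector per
	// position that the language model should consume.
	NewEmbed(image []byte) ([][]float32, error)
}

var errNotImplemented = errors.New("not implemented in this sketch")

// clipModel and mllamaModel stand in for wrappers around the respective
// C bindings; only the abstraction boundary is shown here.
type clipModel struct{}

func (m *clipModel) NewEmbed(image []byte) ([][]float32, error) {
	// A real implementation would run the Clip encoder and projector.
	return nil, errNotImplemented
}

type mllamaModel struct{}

func (m *mllamaModel) NewEmbed(image []byte) ([][]float32, error) {
	// A real implementation would compute the cross-attention state and
	// return it as per-position embeddings.
	return nil, errNotImplemented
}
```
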
Co-authored-by: Michael Yang <mxyng@pm.me>
Showing 2 changed files:
- llama/runner/image.go (new file, mode 0 → 100644)
- llama/runner/image_test.go (new file, mode 0 → 100644)