Commit c3c85aa0 authored by Jesse Gross, committed by Jesse Gross

llm: Enable flash attention by default for gemma3

parent 0d713051
@@ -893,6 +893,7 @@ func (f GGML) SupportsFlashAttention() bool {
 // FlashAttention checks if the model should enable flash attention
 func (f GGML) FlashAttention() bool {
 	return slices.Contains([]string{
+		"gemma3",
 		"gptoss", "gpt-oss",
 		"qwen3",
 		"qwen3moe",
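For readers unfamiliar with the pattern, the check in this hunk is an allowlist lookup with `slices.Contains`. The following is a minimal standalone sketch, not the project's actual code: `flashAttentionByDefault` and its `arch` parameter are hypothetical names, the value the real method compares against is not visible in the hunk, and the list holds only the entries shown above (the real list is truncated in the diff).

```go
package main

import (
	"fmt"
	"slices"
)

// flashAttentionByDefault is a hypothetical stand-in for GGML.FlashAttention.
// It reports whether arch appears in the allowlist. Only the entries visible
// in the diff are included; the real list continues past the hunk.
func flashAttentionByDefault(arch string) bool {
	return slices.Contains([]string{
		"gemma3", // entry added by this commit
		"gptoss", "gpt-oss",
		"qwen3",
		"qwen3moe",
	}, arch)
}

func main() {
	// "llama" is used here only as an example of a name absent from the sketch's list.
	for _, arch := range []string{"gemma3", "llama"} {
		fmt.Printf("%s: %v\n", arch, flashAttentionByDefault(arch))
	}
}
```

Under these assumptions the program prints `gemma3: true` and `llama: false`. The match is an exact string comparison, which is presumably why both `gptoss` and `gpt-oss` are listed separately.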