OpenDAS / ollama
Commit bd68d3ae (Unverified)
Authored May 14, 2025 by Bruce MacDonald
Committed by GitHub, May 14, 2025
ggml: update qwen25vl vision size estimate (#10711)
Parent: ff80718e
Showing 1 changed file with 6 additions and 16 deletions

fs/ggml/ggml.go (+6 -16)
@@ -6,7 +6,6 @@ import (
 	"fmt"
 	"io"
 	"log/slog"
-	"math"
 	"slices"
 	"strings"
@@ -653,24 +652,15 @@ func (llm GGML) VisionGraphSize() (weights, graphSize uint64) {
...
@@ -653,24 +652,15 @@ func (llm GGML) VisionGraphSize() (weights, graphSize uint64) {
numPatches
*
numPatches
*
headCount
)
numPatches
*
numPatches
*
headCount
)
case
"qwen25vl"
:
case
"qwen25vl"
:
maxPixels
:=
uint64
(
llm
.
KV
()
.
Uint
(
"vision.max_pixels"
,
28
*
28
*
1280
))
maxPixels
:=
uint64
(
llm
.
KV
()
.
Uint
(
"vision.max_pixels"
,
28
*
28
*
1280
))
mergeSize
:=
uint64
(
llm
.
KV
()
.
Uint
(
"vision.spatial_merge_size"
,
2
))
temporalPatchSize
:=
uint64
(
2
)
numPatches
:=
maxPixels
/
(
patchSize
*
patchSize
)
// Calculate max possible patches based on max_pixels
maxHeight
:=
uint64
(
math
.
Sqrt
(
float64
(
maxPixels
)))
maxWidth
:=
maxPixels
/
maxHeight
maxGridHeight
:=
maxHeight
/
patchSize
maxGridWidth
:=
maxWidth
/
patchSize
// Account for merged patches (2x2 grid)
numPatches
:=
(
maxGridHeight
*
maxGridWidth
)
/
(
mergeSize
*
mergeSize
)
// Calculate graph size based on typical operations in ProcessImage and createPatches
graphSize
=
4
*
(
maxPixels
*
numChannels
+
// Original image storage
graphSize
=
4
*
(
maxPixels
*
numChannels
+
// Original image storage
// Normalized pixels
// Normalized pixels
maxPixels
*
numChannels
+
maxPixels
*
numChannels
+
// Patches storage (numPatches * channels *
temporalPatchSize *
patchSize^2)
// Patches storage (numPatches * channels * patchSize^2)
numPatches
*
numChannels
*
temporalPatchSize
*
patchSize
*
patchSize
+
numPatches
*
numChannels
*
patchSize
*
patchSize
+
// Self-attention calculations
(similar to other architectures)
// Self-attention calculations
numPatches
*
numPatches
*
headCount
+
numPatches
*
numPatches
*
headCount
+
// Additional buffer for processing
// Additional buffer for processing
embeddingLength
*
numPatches
)
embeddingLength
*
numPatches
)
...
...
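
To see how the change affects the estimate, the removed and added numPatches formulas can be evaluated side by side. The sketch below is illustrative only and is not code from the commit. The names visionParams, oldNumPatches, newNumPatches and graphSize are hypothetical, and the values used for patchSize, numChannels, headCount and embeddingLength are assumptions picked for the example. Only the max_pixels default (28*28*1280), the spatial merge size default (2) and the temporal patch size (2) come from the diff.

```go
package main

import (
	"fmt"
	"math"
)

// visionParams collects the inputs to the estimate.
// Hypothetical type; the values set in main are assumed, not read from a model.
type visionParams struct {
	maxPixels       uint64
	patchSize       uint64
	numChannels     uint64
	headCount       uint64
	embeddingLength uint64
}

// oldNumPatches follows the removed lines: derive a near-square grid from
// maxPixels, then divide by the merge window (mergeSize x mergeSize).
func oldNumPatches(p visionParams, mergeSize uint64) uint64 {
	maxHeight := uint64(math.Sqrt(float64(p.maxPixels)))
	maxWidth := p.maxPixels / maxHeight
	maxGridHeight := maxHeight / p.patchSize
	maxGridWidth := maxWidth / p.patchSize
	return (maxGridHeight * maxGridWidth) / (mergeSize * mergeSize)
}

// newNumPatches follows the added line: one patch per patchSize^2 pixels,
// with no reduction for merged patches.
func newNumPatches(p visionParams) uint64 {
	return p.maxPixels / (p.patchSize * p.patchSize)
}

// graphSize evaluates the sum shown in the diff. Passing temporalPatchSize = 2
// reproduces the removed expression; passing 1 is equivalent to the added
// expression, which has no temporal factor.
func graphSize(p visionParams, numPatches, temporalPatchSize uint64) uint64 {
	return 4 * (p.maxPixels*p.numChannels + // original image storage
		p.maxPixels*p.numChannels + // normalized pixels
		numPatches*p.numChannels*temporalPatchSize*p.patchSize*p.patchSize + // patches storage
		numPatches*numPatches*p.headCount + // self-attention
		p.embeddingLength*numPatches) // additional buffer
}

func main() {
	p := visionParams{
		maxPixels:       28 * 28 * 1280, // default from the diff
		patchSize:       14,             // assumed
		numChannels:     3,              // assumed
		headCount:       16,             // assumed
		embeddingLength: 1280,           // assumed
	}

	oldN := oldNumPatches(p, 2)
	newN := newNumPatches(p)
	fmt.Println("old numPatches:", oldN, "graph bytes:", graphSize(p, oldN, 2))
	fmt.Println("new numPatches:", newN, "graph bytes:", graphSize(p, newN, 1))
}
```

With these assumed inputs, the added formula gives 5120 patches where the removed one gave 1260. Because the self-attention term grows with the square of numPatches, the overall estimate rises by roughly an order of magnitude.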