OpenDAS / ollama
Commit 58ce2d82, authored Jan 08, 2024 by Jeffrey Morgan
better estimate scratch buffer size
parent 18ddf6d5
Showing 1 changed file with 2 additions and 2 deletions
llm/llm.go (+2 -2)
...
@@ -62,8 +62,8 @@ func New(workDir, model string, adapters, projectors []string, opts api.Options)
 	// this amount is the overhead + tensors in memory
 	// TODO: get this from the llama.cpp's graph calcluations instead of
-	// guessing it's ~1/7th of the kv cache times gqa
+	// estimating it's 1/6 * kv_cache_size * num_gqa
-	requiredAlloc := int64(ggml.NumGQA()) * requiredKv / 7
+	requiredAlloc := int64(ggml.NumGQA()) * requiredKv / 6
 	requiredTotal := requiredModel + requiredKv + requiredAlloc
...
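The arithmetic this commit changes can be tried in isolation. The sketch below is illustrative only and is not code from the repository: `estimateScratch` and `estimateTotal` are hypothetical helper names, the divisor is passed in as a parameter so the old value (7) and the new value (6) can be compared, and the sample sizes in `main` are made-up numbers. It assumes, as in the diff, that the scratch estimate is `numGQA * requiredKv / divisor` using integer division.

```go
package main

import "fmt"

// estimateScratch is a hypothetical helper mirroring the expression in the
// diff: int64(ggml.NumGQA()) * requiredKv / divisor.
// The division is integer division, so the result is truncated.
func estimateScratch(numGQA, requiredKv, divisor int64) int64 {
	return numGQA * requiredKv / divisor
}

// estimateTotal is a hypothetical helper mirroring requiredTotal in the diff:
// model size + KV cache size + estimated scratch buffer.
func estimateTotal(requiredModel, requiredKv, numGQA, divisor int64) int64 {
	return requiredModel + requiredKv + estimateScratch(numGQA, requiredKv, divisor)
}

func main() {
	// Made-up sizes, in whatever unit requiredKv is measured in.
	const (
		requiredModel int64 = 4000
		requiredKv    int64 = 1000
		numGQA        int64 = 8
	)

	before := estimateTotal(requiredModel, requiredKv, numGQA, 7) // divisor before this commit
	after := estimateTotal(requiredModel, requiredKv, numGQA, 6)  // divisor after this commit

	fmt.Println("total with /7:", before)
	fmt.Println("total with /6:", after)
}
```

Because the divisor shrinks from 7 to 6, the estimated scratch allocation grows by roughly one sixth for the same KV cache size and GQA factor, so the total requirement computed in `New` becomes more conservative.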