OpenDAS / ollama
Commit 539be436 (Unverified)
Authored Dec 04, 2024 by Sam. Committed by GitHub Dec 03, 2024.
llm: normalise kvct parameter handling (#7926)
parent 1bdab9fd
Showing 2 changed files with 2 additions and 2 deletions:
  llm/memory.go  +1 -1
  llm/server.go  +1 -1
llm/memory.go @ 539be436
...
@@ -129,7 +129,7 @@ func EstimateGPULayers(gpus []discover.GpuInfo, ggml *GGML, projectors []string,
 	var kvct string
 	if fa {
-		requested := envconfig.KvCacheType()
+		requested := strings.ToLower(envconfig.KvCacheType())
 		if requested != "" && ggml.SupportsKVCacheType(requested) {
 			kvct = requested
 		}
...
llm/server.go @ 539be436
...
@@ -225,7 +225,7 @@ func NewLlamaServer(gpus discover.GpuInfoList, model string, ggml *GGML, adapter
 		fa = false
 	}
-	kvct := envconfig.KvCacheType()
+	kvct := strings.ToLower(envconfig.KvCacheType())
 	if fa {
 		slog.Info("enabling flash attention")
...
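The effect of both hunks can be sketched in isolation: the configured KV cache type is lower-cased before it is validated, so a value such as "Q8_0" is handled the same way as "q8_0". The sketch below is an illustration only, not the project's code. The helper name resolveKVCacheType and the supportedKVCacheTypes allow-list are hypothetical stand-ins for the surrounding logic and for ggml.SupportsKVCacheType, whose implementation is not part of this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedKVCacheTypes is a hypothetical allow-list used only for this
// sketch. The real check is ggml.SupportsKVCacheType, not shown in the diff.
var supportedKVCacheTypes = map[string]bool{
	"f16":  true,
	"q8_0": true,
	"q4_0": true,
}

// resolveKVCacheType (hypothetical helper) mirrors the pattern in the diff:
// lower-case the requested value first, then accept it only if it is
// non-empty and supported. It returns "" when the request is rejected.
func resolveKVCacheType(requested string) string {
	requested = strings.ToLower(requested)
	if requested != "" && supportedKVCacheTypes[requested] {
		return requested
	}
	return ""
}

func main() {
	// Mixed-case input now resolves to the same value as lower-case input.
	for _, v := range []string{"Q8_0", "q8_0", "F16", "bogus", ""} {
		fmt.Printf("%q -> %q\n", v, resolveKVCacheType(v))
	}
}
```

Without the strings.ToLower call, a case-sensitive support check like the one assumed here would reject "Q8_0" and leave kvct empty, which is presumably the inconsistency this commit removes.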