OpenDAS / ollama
Commit 05d53457 (Unverified)
Authored Sep 16, 2025 by russcoss; committed by GitHub, Sep 16, 2025

refactor: use the built-in max/min to simplify the code (#12280)

Signed-off-by: russcoss <russcoss@outlook.com>

Parent: b225508c
Changes: 4 changed files with 4 additions and 20 deletions

runner/llamarunner/cache.go (+1 -6)
runner/ollamarunner/cache.go (+1 -6)
server/internal/internal/backoff/backoff.go (+1 -4)
server/sched.go (+1 -4)

runner/llamarunner/cache.go
@@ -204,13 +204,8 @@ func (c *InputCache) ShiftDiscard(inputLen int, numKeep int) int {
     targetFree = max(targetFree, 1)
     currentFree := c.numCtx - inputLen
-    discard := targetFree - currentFree
-    if discard < 0 {
-        discard = 0
-    }
-    return discard
+    return max(targetFree-currentFree, 0)
 }
 type ErrReprocessInputs struct {
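
The ShiftDiscard change replaces a subtract-then-clamp `if` with a single call to the `max` built-in (available since Go 1.21). Below is a minimal sketch of why the two forms agree. The function names `discardOld` and `discardNew` are hypothetical, and the inputs are passed as plain parameters rather than read from an `InputCache`, so this is an illustration of the pattern and not the repository's code.

```go
package main

import "fmt"

// discardOld mirrors the removed form: subtract, then clamp negatives to zero.
// (Hypothetical helper; not a function in the repository.)
func discardOld(targetFree, currentFree int) int {
	discard := targetFree - currentFree
	if discard < 0 {
		discard = 0
	}
	return discard
}

// discardNew mirrors the added form, using the max built-in (Go 1.21+).
// (Hypothetical helper; not a function in the repository.)
func discardNew(targetFree, currentFree int) int {
	return max(targetFree-currentFree, 0)
}

func main() {
	// Each pair is {targetFree, currentFree}; both forms print the same value.
	for _, c := range [][2]int{{4, 1}, {4, 4}, {1, 9}} {
		fmt.Println(discardOld(c[0], c[1]), discardNew(c[0], c[1]))
	}
}
```

Because `max` accepts any ordered type, the same one-liner works unchanged for the `int32` variant in runner/ollamarunner/cache.go.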

runner/ollamarunner/cache.go
@@ -242,13 +242,8 @@ func (c *InputCache) ShiftDiscard(inputLen int32, numKeep int32) int32 {
     targetFree = max(targetFree, 1)
     currentFree := c.numCtx - inputLen
-    discard := targetFree - currentFree
-    if discard < 0 {
-        discard = 0
-    }
-    return discard
+    return max(targetFree-currentFree, 0)
 }
 type ErrReprocessInputs struct {

server/internal/internal/backoff/backoff.go
@@ -25,10 +25,7 @@ func Loop(ctx context.Context, maxBackoff time.Duration) iter.Seq2[int, error] {
     // n^2 backoff timer is a little smoother than the
     // common choice of 2^n.
-    d := time.Duration(n*n) * 10 * time.Millisecond
-    if d > maxBackoff {
-        d = maxBackoff
-    }
+    d := min(time.Duration(n*n)*10*time.Millisecond, maxBackoff)
     // Randomize the delay between 0.5-1.5 x msec, in order
     // to prevent accidental "thundering herd" problems.
     d = time.Duration(float64(d) * (rand.Float64() + 0.5))

server/sched.go
@@ -382,10 +382,7 @@ func (pending *LlmRequest) useLoadedRunner(runner *runnerRef, finished chan *Llm
 // load creates a new model based on req and loads it. If requireFull is true then the model must be loaded fully onto GPUs
 // (if any). Returns whether the scheduler needs to evict a model to make this one fit.
 func (s *Scheduler) load(req *LlmRequest, f *ggml.GGML, gpus discover.GpuInfoList, requireFull bool) bool {
-    numParallel := int(envconfig.NumParallel())
-    if numParallel < 1 {
-        numParallel = 1
-    }
+    numParallel := max(int(envconfig.NumParallel()), 1)
     // Embedding models should always be loaded with parallel=1
     if req.model.CheckCapabilities(model.CapabilityCompletion) != nil {