Commit ad6f6a1d, authored and committed by Jesse Gross

llm: Change memory allocation backoff from exponential to incremental

If we create a memory layout that should fit based on reported free VRAM
but allocation still fails, we start applying a backoff. This reduces
free VRAM by an exponentially growing percentage (1%, 2%, 4%...).
However, the points chosen tend to be too dense at the beginning and too
sparse at the end. Therefore, this switches to an incremental backoff
(10%, 20%, 30%...).
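
To make the density argument concrete, the following is a minimal sketch that lists the backoff values each rule produces before the give-up threshold (`backoff > 1`) is crossed. The helper names `schedule`, `exponential`, and `incremental` are illustrative and do not appear in the commit, and `float64` is an assumption about the variable's type.

```go
package main

import "fmt"

// schedule collects every backoff value produced by next, starting from 0,
// until the value exceeds 1. In the patched loop, a failure that arrives
// while backoff > 1 aborts instead of retrying.
func schedule(next func(float64) float64) []float64 {
	var steps []float64
	backoff := 0.0
	for backoff <= 1 {
		backoff = next(backoff)
		steps = append(steps, backoff)
	}
	return steps
}

// exponential mirrors the old rule: start at 1%, then double.
func exponential(b float64) float64 {
	if b == 0 {
		return 0.01
	}
	return b * 2
}

// incremental mirrors the new rule: add 10 percentage points each time.
func incremental(b float64) float64 {
	return b + 0.1
}

func main() {
	rules := []struct {
		name string
		next func(float64) float64
	}{
		{"exponential", exponential},
		{"incremental", incremental},
	}
	for _, r := range rules {
		fmt.Printf("%-12s", r.name)
		for _, b := range schedule(r.next) {
			fmt.Printf(" %.2f", b)
		}
		fmt.Println()
	}
}
```

Under the old rule, four of the attempts land at or below 8%, and the gaps then widen from 16% to 32% to 64%. The new rule spaces its attempts evenly across the range.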
parent 6723a40b
@@ -766,15 +766,12 @@ nextOperation:
 // Memory allocation failed even though we created a layout that we thought should
 // fit in available memory. This could happen if either our free memory reports
 // are incorrect or if available memory is changing between layout and allocation
-// time. Apply an exponential backoff to try to find the real amount of available
-// space.
+// time. Apply a backoff to try to find the real amount of available space.
 if backoff > 1 {
 	slog.Warn("memory layout cannot be allocated", "memory", resp.Memory)
 	return nil, errors.New("memory layout cannot be allocated")
-} else if backoff == 0 {
-	backoff = 0.01
 } else {
-	backoff *= 2
+	backoff += 0.1
 }
 slog.Info("model layout did not fit, applying backoff", "backoff", fmt.Sprintf("%.2f", backoff))
......
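
As a rough illustration of how the patched branch behaves inside a retry loop, here is a self-contained simulation. It is a sketch under stated assumptions, not the project's code: `fitWithBackoff`, the `alloc` callback, and the rule that the budget is `reported * (1 - backoff)` are all hypothetical stand-ins for the real layout and allocation steps, which are elided from the hunk.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// fitWithBackoff is a hypothetical model of the retry loop. reported is the
// free VRAM the device claims; alloc reports whether a layout of the given
// size can actually be allocated.
func fitWithBackoff(reported uint64, alloc func(uint64) bool) (budget uint64, attempts int, err error) {
	backoff := 0.0
	for {
		attempts++
		// Assumption: the backoff shrinks the reported free memory
		// proportionally. Clamp at zero so the conversion stays defined.
		scale := math.Max(0, 1-backoff)
		budget = uint64(float64(reported) * scale)
		if alloc(budget) {
			return budget, attempts, nil
		}
		// Same branch structure as the patched code.
		if backoff > 1 {
			return 0, attempts, errors.New("memory layout cannot be allocated")
		}
		backoff += 0.1
	}
}

func main() {
	// The device reports 1000 units free, but only 750 can be allocated.
	realLimit := uint64(750)
	budget, attempts, err := fitWithBackoff(1000, func(n uint64) bool {
		return n <= realLimit
	})
	fmt.Println(budget, attempts, err)
}
```

In this model the loop stops at the first 10% step whose budget fits under the real limit, so the result can undershoot the true free space by up to one step.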