Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
OpenDAS
ollama
Commits
75d17fc6
"vscode:/vscode.git/clone" did not exist on "98baf41ac44ec0470d85d4441fb739fa6ea18e1c"
Unverified
Commit
75d17fc6
authored
Oct 15, 2025
by
Daniel Hiltgen
Committed by
GitHub
Oct 15, 2025
Browse files
perf: backport cuda iGPU sched spin (#12641)
parent
8fafc8af
Changes
2
Hide whitespace changes
Inline
Side-by-side
Showing
2 changed files
with
58 additions
and
0 deletions
+58
-0
llama/patches/0030-CUDA-Changing-the-CUDA-scheduling-strategy-to-spin-1.patch
...UDA-Changing-the-CUDA-scheduling-strategy-to-spin-1.patch
+49
-0
ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu
ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu
+9
-0
No files found.
llama/patches/0030-CUDA-Changing-the-CUDA-scheduling-strategy-to-spin-1.patch
0 → 100644
View file @
75d17fc6
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Julius Tischbein <ju.tischbein@gmail.com>
Date: Wed, 15 Oct 2025 13:54:15 +0200
Subject: [PATCH] CUDA: Changing the CUDA scheduling strategy to spin (#16585)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* CUDA set scheduling strategy to spinning for cc121
* Using prop.major and prop.minor, include HIP and MUSA
* Exclude HIP and MUSA
* Remove trailing whitespace
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Remove empty line
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---
ggml/src/ggml-cuda/ggml-cuda.cu | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/ggml/src/ggml-cuda/ggml-cuda.cu b/ggml/src/ggml-cuda/ggml-cuda.cu
index 6a278b5e9..87941f872 100644
--- a/ggml/src/ggml-cuda/ggml-cuda.cu
+++ b/ggml/src/ggml-cuda/ggml-cuda.cu
@@ -340,6 +340,15 @@
static ggml_cuda_device_info ggml_cuda_init() {
} else if (device_name.substr(0, 21) == "NVIDIA GeForce GTX 16") {
turing_devices_without_mma.push_back({ id, device_name });
}
+
+ // Temporary performance fix:
+ // Setting device scheduling strategy for iGPUs with cc121 to "spinning" to avoid delays in cuda synchronize calls.
+ // TODO: Check for future drivers the default scheduling strategy and
+ // remove this call again when cudaDeviceScheduleSpin is default.
+ if (prop.major == 12 && prop.minor == 1) {
+ CUDA_CHECK(cudaSetDeviceFlags(cudaDeviceScheduleSpin));
+ }
+
#endif // defined(GGML_USE_HIP)
}
ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu
View file @
75d17fc6
...
...
@@ -340,6 +340,15 @@ static ggml_cuda_device_info ggml_cuda_init() {
}
else
if
(
device_name
.
substr
(
0
,
21
)
==
"NVIDIA GeForce GTX 16"
)
{
turing_devices_without_mma
.
push_back
({
id
,
device_name
});
}
// Temporary performance fix:
// Setting device scheduling strategy for iGPUs with cc121 to "spinning" to avoid delays in cuda synchronize calls.
// TODO: Check for future drivers the default scheduling strategy and
// remove this call again when cudaDeviceScheduleSpin is default.
if
(
prop
.
major
==
12
&&
prop
.
minor
==
1
)
{
CUDA_CHECK
(
cudaSetDeviceFlags
(
cudaDeviceScheduleSpin
));
}
#endif // defined(GGML_USE_HIP)
}
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment