OpenDAS / ollama
Commit 4667452a
Authored Mar 06, 2025 by xuxzh1

opt quantize_q8_1 kernel

Parent: 1dc4b857

Showing 1 changed file with 4 additions and 3 deletions

llama/ggml-cuda/quantize.cu (+4 -3)
@@ -58,9 +58,10 @@ static __global__ __launch_bounds__(1024) void quantize_q8_1(const float * __res
     if (iqs > 0) {
         return;
     }
-    reinterpret_cast<half&>(y[ib].ds.x) = d;
-    reinterpret_cast<half&>(y[ib].ds.y) = sum;
+    ggml_half2 ds = {d, sum};
+    y[ib].ds = ds;
+    //reinterpret_cast<half&>(y[ib].ds) = ds;
+    //reinterpret_cast<half&>(y[ib].ds.y) = sum;
 }
 template <mmq_q8_1_ds_layout ds_layout>
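
The commit message does not say why the two stores were merged. One plausible reading is that writing `d` and `sum` as a single `ggml_half2` value lets the compiler emit one 4-byte store instead of two 2-byte stores. The following is a minimal host-side C++ sketch of that idea, not the kernel itself. The names `half_bits`, `half2_bits`, `block_model`, `store_separately`, `store_packed`, and `same_ds_bytes` are hypothetical stand-ins introduced here, with a 16-bit integer modeling `half`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins: a 16-bit payload models `half`, and a 4-byte-aligned
// pair models `ggml_half2`. Neither is the real ggml or CUDA type.
using half_bits = std::uint16_t;

struct alignas(4) half2_bits {
    half_bits x;
    half_bits y;
};

static_assert(sizeof(half2_bits) == 4, "pair must fit one 32-bit store");

// Simplified model of a q8_1 block: the scale/sum pair followed by the quants.
struct block_model {
    half2_bits ds;       // x = scale d, y = sum, as in y[ib].ds
    std::int8_t qs[32];  // quantized values (unused in this sketch)
};

// Old form: two separate 16-bit stores, one per field.
inline void store_separately(block_model & b, half_bits d, half_bits sum) {
    b.ds.x = d;
    b.ds.y = sum;
}

// New form: build the pair in a local, then assign it as one 4-byte object,
// which a compiler is free to emit as a single 32-bit store.
inline void store_packed(block_model & b, half_bits d, half_bits sum) {
    half2_bits ds = {d, sum};
    b.ds = ds;
}

// Helper: check that both forms leave identical bytes in the ds field.
inline bool same_ds_bytes(const block_model & a, const block_model & b) {
    return std::memcmp(&a.ds, &b.ds, sizeof(half2_bits)) == 0;
}
```

Both functions leave the same bytes in `ds`, so the change should be behavior-preserving. Whether it actually reduces the number of store instructions on a given GPU depends on the compiler and on the alignment of the real `ggml_half2` type, which this sketch only assumes.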