OpenDAS / vllm_cscc
Commit 1cdc8861 (Unverified)
Authored Feb 21, 2025 by Szymon Ożóg
Committed by GitHub on Feb 20, 2025
Missing comment explaining VDR variable in GGUF kernels (#13290)
Parent: 31aa045c
Showing 1 changed file with 2 additions and 0 deletions
csrc/quantization/gguf/vecdotq.cuh (+2 -0)
...
@@ -37,6 +37,8 @@ static __device__ __forceinline__ int get_int_from_uint8_aligned(const uint8_t *
     return *((const int *) (x8 + sizeof(int) * i32)); // assume at least 4 byte alignment
 }
+// VDR = vec dot ratio, how many contiguous integers each thread processes when the vec dot kernel is called
+// MMVQ = mul_mat_vec_q, MMQ = mul_mat_q
 #define VDR_Q4_0_Q8_1_MMVQ 2
 #define VDR_Q4_0_Q8_1_MMQ 4
...
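The comment added in this commit defines VDR as the number of contiguous integers each thread processes per vec dot call. The following is a minimal host-side C++ sketch of that idea, not the kernel code from vecdotq.cuh. All names in it (`pack4`, `dot_packed_int8`, `thread_vec_dot`, `threads_needed`, `full_dot`) are hypothetical, and the byte-wise loop only stands in for whatever packed integer dot product the real kernels use. The VDR values 2 and 4 are taken from the two macros shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack four signed 8-bit values into one 32-bit integer.
static uint32_t pack4(int8_t a, int8_t b, int8_t c, int8_t d) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(a))
         | (static_cast<uint32_t>(static_cast<uint8_t>(b)) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(c)) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(d)) << 24);
}

// Hypothetical helper: dot product of the four int8 lanes of two packed integers.
static int dot_packed_int8(uint32_t a, uint32_t b) {
    int sum = 0;
    for (int k = 0; k < 4; ++k) {
        const int8_t ak = static_cast<int8_t>(static_cast<uint8_t>(a >> (8 * k)));
        const int8_t bk = static_cast<int8_t>(static_cast<uint8_t>(b >> (8 * k)));
        sum += ak * bk;
    }
    return sum;
}

// Work of one simulated thread: it starts at integer index tid * vdr and
// processes vdr contiguous integers, which is what the VDR comment describes.
static int thread_vec_dot(const std::vector<uint32_t>& x,
                          const std::vector<uint32_t>& y,
                          int tid, int vdr) {
    assert(vdr > 0);
    int sum = 0;
    for (int i = 0; i < vdr; ++i) {
        const std::size_t idx = static_cast<std::size_t>(tid) * vdr + i;
        sum += dot_packed_int8(x[idx], y[idx]);
    }
    return sum;
}

// Number of simulated threads needed to cover n_ints integers at a given VDR.
static int threads_needed(int n_ints, int vdr) {
    assert(vdr > 0 && n_ints % vdr == 0);
    return n_ints / vdr;
}

// Sum of all per-thread partial results; the total does not depend on vdr.
static int full_dot(const std::vector<uint32_t>& x,
                    const std::vector<uint32_t>& y, int vdr) {
    const int n_threads = threads_needed(static_cast<int>(x.size()), vdr);
    int sum = 0;
    for (int tid = 0; tid < n_threads; ++tid) {
        sum += thread_vec_dot(x, y, tid, vdr);
    }
    return sum;
}
```

In this sketch, a higher VDR means fewer threads each doing more contiguous work: with four integers, a VDR of 2 needs two threads and a VDR of 4 needs one, while the final dot product is the same either way.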