OpenDAS / vllm_cscc
Commit b866cdbd (unverified), authored Dec 23, 2024 by Dipika Sikka; committed by GitHub on Dec 24, 2024
[Misc] Add assertion and helpful message for marlin24 compressed models (#11388)
Parent: 2e726680
Showing 1 changed file with 4 additions and 0 deletions
vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a16_24.py
@@ -61,6 +61,10 @@ class CompressedTensorsW4A16Sparse24(CompressedTensorsScheme):
                       params_dtype: torch.dtype, weight_loader: Callable,
                       **kwargs):
        assert params_dtype == torch.float16, (
            "float16 is required for marlin24 compressd models. Set dtype=torch.float16"  # noqa: E501
        )
        pack_factor = 32 // self.quant_type.size_bits
        output_size_per_partition = sum(output_partition_sizes)
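The guard and the two size computations in this hunk can be tried in isolation. The following is a minimal sketch in plain Python, not the vLLM implementation: `QuantType` and `check_and_pack` are hypothetical stand-ins, dtypes are modeled as strings so that no torch install is needed, and `size_bits=4` is an assumption taken from the "w4" in the file name.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QuantType:
    """Hypothetical stand-in for the scheme's quant_type; only size_bits is modeled."""
    size_bits: int


def check_and_pack(params_dtype: str, quant_type: QuantType,
                   output_partition_sizes: List[int]) -> Tuple[int, int]:
    # Same shape of guard as in the diff, with the dtype modeled as a string.
    assert params_dtype == "float16", (
        "float16 is required for marlin24 compressed models. "
        "Set dtype=torch.float16")
    # How many size_bits-wide values fit into 32 bits.
    pack_factor = 32 // quant_type.size_bits
    # Total output width across the partitions handled by this layer.
    output_size_per_partition = sum(output_partition_sizes)
    return pack_factor, output_size_per_partition


w4 = QuantType(size_bits=4)  # assumed 4-bit weights
print(check_and_pack("float16", w4, [1024, 1024, 2048]))  # (8, 4096)

try:
    check_and_pack("bfloat16", w4, [4096])
except AssertionError as err:
    print(f"rejected: {err}")
```

With 4-bit weights the pack factor is 32 // 4 = 8. Any dtype other than float16 fails at the assertion with the explanatory message, before any sizes are computed.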