Commit ee8a2951 authored by vllmellm, committed by GitHub

[Bugfix] Fix compressed-tensors quantization failure for DeepSeek-R1 on MI300x (#36247)


Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
parent 755356b3
@@ -756,7 +756,7 @@ direct_register_custom_op(
 )
 
 
-class DeepSeekV2FusedQkvAProj(MergedColumnParallelLinear):
+class DeepSeekV2FusedQkvAProjLinear(MergedColumnParallelLinear):
     def __init__(
         self,
         input_size: int,
@@ -848,7 +848,7 @@ class DeepseekV2MLAAttention(nn.Module):
         self.max_position_embeddings = max_position_embeddings
 
         if self.q_lora_rank is not None:
-            self.fused_qkv_a_proj = DeepSeekV2FusedQkvAProj(
+            self.fused_qkv_a_proj = DeepSeekV2FusedQkvAProjLinear(
                 self.hidden_size,
                 [self.q_lora_rank, self.kv_lora_rank + self.qk_rope_head_dim],
                 quant_config=quant_config,
......
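
The commit message does not say why a pure rename fixes the quantization failure. One possibility, stated here as an assumption and not confirmed by this page, is that the quantization config selects layers by matching a target string such as "Linear" against the module's class name, so a class whose name lacks that substring is skipped. A minimal sketch of that idea follows; `matches_target` and the stand-in classes are hypothetical and are not vLLM or compressed-tensors APIs.

```python
# Hypothetical illustration only: these names are stand-ins, not the
# real vLLM or compressed-tensors classes or functions.


class MergedColumnParallelLinear:
    """Stand-in for the real base class."""


class DeepSeekV2FusedQkvAProj(MergedColumnParallelLinear):
    """Old class name (before the commit)."""


class DeepSeekV2FusedQkvAProjLinear(MergedColumnParallelLinear):
    """New class name (after the commit)."""


def matches_target(module: object, targets: list[str]) -> bool:
    """Return True if any target occurs in the module's class name.

    Assumed matching rule: case-insensitive substring match on the
    class name. The real matcher may behave differently.
    """
    name = type(module).__name__.lower()
    return any(target.lower() in name for target in targets)


targets = ["Linear"]  # assumed target list in a quantization config
old_matches = matches_target(DeepSeekV2FusedQkvAProj(), targets)
new_matches = matches_target(DeepSeekV2FusedQkvAProjLinear(), targets)
print(old_matches, new_matches)  # prints: False True
```

Under this assumed rule, only the renamed class is picked up by a "Linear" target, even though both classes inherit from the same base.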