Commit c964b9ad authored by zhuwenwen

skip indexer_k_cache

parent ac4f685b
......
@@ -682,13 +682,13 @@ def sparse_attn_indexer(
             quant_block_size,
             scale_fmt,
         )
-    else:
-        ops.indexer_k_cache(
-            k,
-            kv_cache,
-            slot_mapping,
-            scale_fmt,
-        )
+    # else:
+    #     ops.indexer_k_cache(
+    #         k,
+    #         kv_cache,
+    #         slot_mapping,
+    #         scale_fmt,
+    #     )
     topk_indices_buffer[: hidden_states.shape[0]] = -1
     if has_prefill:
......