Merge pull request #1227 from kvcache-ai/change-yaml

change inject yaml

Commit 8ba7e5d4 authored Apr 29, 2025 by wang jiahao, committed by GitHub Apr 29, 2025
Parents: 2a224b25, 48dfbc8f

Showing 1 changed file with 1 addition and 1 deletion

ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml +1 -1

--- a/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml
+++ b/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-fp8-linear-ggml-experts-serve.yaml
@@ -44,7 +44,7 @@
 - match:
    name: "^model\\.layers\\..*\\.self_attn$"
  replace:
-    class: ktransformers.operators.attention.flashinfer_attn # optimized MLA implementation
+    class: ktransformers.operators.balance_serve_attention.flashinfer_attn # optimized MLA implementation
    kwargs:
      generate_device: "cuda"
      prefill_device: "cuda"
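
To make the effect of the one-line change concrete, here is a minimal sketch of how a match/replace rule of this shape could be evaluated. It is not the ktransformers loader: `rule_matches`, `split_class_path`, and `load_class` are hypothetical helpers, and the `rule` dict is a hand-written stand-in for the YAML entry in the diff. The sketch assumes the `name` pattern is applied as a regular expression to the module name and that `class` is a dotted import path.

```python
import importlib
import re

# Hand-written stand-in for the rule in the diff (after the change).
# The YAML string "^model\\.layers\\..*\\.self_attn$" unescapes to this regex.
rule = {
    "match": {"name": r"^model\.layers\..*\.self_attn$"},
    "replace": {
        "class": "ktransformers.operators.balance_serve_attention.flashinfer_attn",
        "kwargs": {"generate_device": "cuda", "prefill_device": "cuda"},
    },
}


def rule_matches(rule, module_name):
    """Return True if the rule's name pattern matches the module name (assumed regex semantics)."""
    return re.search(rule["match"]["name"], module_name) is not None


def split_class_path(dotted_path):
    """Split 'pkg.module.Attr' into ('pkg.module', 'Attr')."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    return module_path, attr_name


def load_class(dotted_path):
    """Import the module part of a dotted path and return the named attribute."""
    module_path, attr_name = split_class_path(dotted_path)
    return getattr(importlib.import_module(module_path), attr_name)


# The pattern selects each layer's attention module, not its submodules or other blocks.
print(rule_matches(rule, "model.layers.3.self_attn"))         # True
print(rule_matches(rule, "model.layers.3.self_attn.q_proj"))  # False
print(rule_matches(rule, "model.layers.3.mlp"))               # False

# After this commit the replacement resolves inside balance_serve_attention
# rather than attention; only the module part of the path changed.
print(split_class_path(rule["replace"]["class"]))
```

Under these assumptions, calling `load_class(rule["replace"]["class"])` would require ktransformers to be installed, so the sketch only splits the path rather than importing it.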