Update README.md (#72)

Commit a9444cd6 (unverified), authored Apr 22, 2025 by Shengyu Liu; committed by GitHub Apr 22, 2025

Showing with 1 addition and 1 deletion

README.md +1 -1

--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 We're excited to announce the new release of Flash MLA, which delivers 5% ~ 15% performance improvement on compute-bound workloads, achieving up to 660 TFlops on NVIDIA H800 SXM5 GPUs. The interface of the new version is fully compatible with the old one. Just switch to the new version and enjoy the instant speedup! 🚀🚀🚀
-Besides, we'd love to share the technical details behind the new kernel! Check out our deep-dive write-up here: <LINK>
+Besides, we'd love to share the technical details behind the new kernel! Check out our deep-dive write-up [here](docs/20250422-new-kernel-deep-dive.md).
 The new kernel primarily targets compute-intensive settings (where the number of q heads $\times$ the number of q tokens per request (if MTP is disabled then it's 1) $\ge 64$). For memory-bound cases, we recommend using version [b31bfe7](https://github.com/deepseek-ai/FlashMLA/tree/b31bfe72a83ea205467b3271a5845440a03ed7cb) for optimal performance.
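
Note on the last context line: the rule of thumb it states (number of q heads × number of q tokens per request ≥ 64) can be written as a small check. This is only a sketch: the function names `is_compute_bound` and `suggested_version` are hypothetical and not part of the FlashMLA API, and the threshold and the `b31bfe7` fallback are taken directly from the README text.

```python
# Hypothetical helpers illustrating the README's rule of thumb.
# None of these names exist in FlashMLA; they are for illustration only.

COMPUTE_BOUND_THRESHOLD = 64  # from the README: q heads x q tokens per request >= 64


def is_compute_bound(num_q_heads: int, q_tokens_per_request: int = 1) -> bool:
    """Return True when the workload falls in the compute-intensive regime.

    q_tokens_per_request defaults to 1, which the README gives as the
    value when MTP is disabled.
    """
    return num_q_heads * q_tokens_per_request >= COMPUTE_BOUND_THRESHOLD


def suggested_version(num_q_heads: int, q_tokens_per_request: int = 1) -> str:
    """Map the regime to the version the README recommends."""
    if is_compute_bound(num_q_heads, q_tokens_per_request):
        return "new kernel"
    return "b31bfe7"  # commit the README recommends for memory-bound cases


if __name__ == "__main__":
    print(suggested_version(128))    # 128 * 1 >= 64
    print(suggested_version(16, 2))  # 16 * 2 < 64
```

With 32 q heads and 2 q tokens per request the product is exactly 64, so the check treats that case as compute-bound, matching the README's $\ge 64$.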