[Dev][Doc] Enhance Flash Attention Implementation in GQA Decoding Example and Fix Typo (#139)
- Add non-split flash attention macro for more flexible kernel generation
- Implement `main_no_split` function to handle single-split scenarios
- Modify kernel selection logic to dynamically choose between split and non-split implementations
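
For illustration only, the split versus non-split selection described above can be sketched in plain Python. This is not the kernel code from this commit: `attend`, `decode_no_split`, `decode_split`, `decode`, and the `num_split` parameter are hypothetical names, and the sketch assumes a single query vector attending over a list of key/value vectors, with partial results merged through their log-sum-exp values.

```python
import math

def attend(q, keys, values):
    """Attention for one query over one KV chunk.
    Returns (output, lse), where lse is the log-sum-exp of the scaled scores."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    denom = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / denom
           for d in range(dim)]
    return out, m + math.log(denom)

def decode_no_split(q, keys, values):
    # Hypothetical single-pass path: no partial results, no combine step.
    out, _ = attend(q, keys, values)
    return out

def decode_split(q, keys, values, num_split):
    # Hypothetical split path: attend over each KV chunk independently,
    # then merge the partial outputs weighted by exp(lse).
    n = len(keys)
    chunk = (n + num_split - 1) // num_split
    partials = [attend(q, keys[i:i + chunk], values[i:i + chunk])
                for i in range(0, n, chunk)]
    lse_max = max(lse for _, lse in partials)
    coeffs = [math.exp(lse - lse_max) for _, lse in partials]
    total = sum(coeffs)
    dim = len(values[0])
    return [sum(c * out[d] for c, (out, _) in zip(coeffs, partials)) / total
            for d in range(dim)]

def decode(q, keys, values, num_split=1):
    # Selection logic: skip the combine step entirely when there is one split.
    if num_split == 1:
        return decode_no_split(q, keys, values)
    return decode_split(q, keys, values, num_split)
```

Both paths should agree up to floating-point rounding, so under these assumptions the single-split path only saves the work of the combine step.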