"llm/git@developer.sourcefind.cn:OpenDAS/ollama.git" did not exist on "ce0dc33cb809405fda18a8077da4058d1f7a5374"
Grouped batched attention + permute (#412)
* grouped attn without batch validates; now move toward grouped batched attn
* grouped batched attention
* working
* remove debug logging; clean up
* reintroduce g_ prefix back to host tensor variables
* format
* rename file
* restore old file
* rename
* consolidate padded/non-padded attention example
* harmonize padding specialization in attn examples
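For orientation, the "grouped batched" pattern in this commit title can be sketched as follows. This is a minimal illustrative reference in numpy, not the kernel implemented here: it assumes a GQA-style layout where several query heads share one key/value head, and all names (`grouped_batched_attention`, the `(batch, heads, seq, dim)` layout) are hypothetical.

```python
import numpy as np

def grouped_batched_attention(q, k, v, n_kv_heads):
    """Hypothetical sketch: q is (batch, n_q_heads, seq, d);
    k and v are (batch, n_kv_heads, seq, d), with n_q_heads a
    multiple of n_kv_heads so each KV head serves a group of
    query heads."""
    b, n_q, s, d = q.shape
    group = n_q // n_kv_heads
    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=1)
    v = np.repeat(v, group, axis=1)
    # Scaled dot-product attention, batched over (batch, head).
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v  # (batch, n_q_heads, seq, d)
```

With `n_kv_heads == n_q_heads` this degenerates to ordinary batched multi-head attention; the grouped case saves KV storage by sharing heads.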