Commit b5853f99, authored by Hongxia Yang, committed by GitHub

[ROCm][AMD][Bugfix] adding a missing triton autotune config (#4845)

parent f09edd8a
@@ -239,6 +239,16 @@ def _attn_fwd_inner(
             num_stages=1,
             num_warps=8,
         ),
+        triton.Config(
+            {
+                "BLOCK_M": 128,
+                "BLOCK_N": 64,
+                "waves_per_eu": 1,
+                "PRE_LOAD_V": False,
+            },
+            num_stages=1,
+            num_warps=4,
+        ),
         triton.Config(
             {
                 "BLOCK_M": 128,
......
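
As a rough illustration of why one missing entry matters: an autotuner benchmarks each candidate configuration and keeps the fastest, so a configuration absent from the list can never be selected. The sketch below is plain Python, not Triton. `pick_fastest`, `fake_cost`, and the `other` candidate are hypothetical names introduced here, and the cost model is invented purely to make the example deterministic.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for one tuning candidate; this is NOT triton.Config.
Candidate = Dict[str, object]


def pick_fastest(
    candidates: List[Candidate], bench: Callable[[Candidate], float]
) -> Candidate:
    """Score every candidate with `bench` and return the lowest-cost one."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return min(candidates, key=bench)


# The entry added by this commit, flattened into one dict for illustration.
new_candidate: Candidate = {
    "BLOCK_M": 128,
    "BLOCK_N": 64,
    "waves_per_eu": 1,
    "PRE_LOAD_V": False,
    "num_stages": 1,
    "num_warps": 4,
}

# Made-up neighbour; its values are assumptions, not taken from the diff.
other: Candidate = {
    "BLOCK_M": 256,
    "BLOCK_N": 64,
    "waves_per_eu": 2,
    "PRE_LOAD_V": False,
    "num_stages": 1,
    "num_warps": 8,
}


def fake_cost(candidate: Candidate) -> float:
    """Invented cost model: pretend fewer warps is faster for this shape."""
    return float(candidate["num_warps"])


# Without the new entry the search can only return what is in the list.
before = pick_fastest([other], fake_cost)
# With it, the tuner is free to choose the new configuration.
after = pick_fastest([other, new_candidate], fake_cost)
```

In a real kernel the benchmark is an actual timed launch on the target GPU, so whether the added configuration wins depends on the hardware and problem shape.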