Commit 54c0c857, authored by vcherepanov-nv, committed by GitHub

Bump cuDNN FE to 1.14.0 (#2072)



* Bump cuDNN FE to 1.14.0
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* Change submodule hash
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* Pick up a cuDNN FE fix
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* New model configs in tests
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* Exclude cuDNN backend for some configs
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

---------
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
parent d770886f
-Subproject commit 9793df569ce413f4b1844a9176f7ae24dd981603
+Subproject commit deda80e5372d50e925d7bf4f76c5db779be3fbd5
@@ -274,6 +274,8 @@ model_configs_mla = {
 "mla_3_0": ModelConfig(8, 1, 16, 128, max_seqlen_kv=2048, head_dim_v=64), # inference
 "mla_3_1": ModelConfig(8, 1, 16, 256, max_seqlen_kv=2048, head_dim_v=128), # inference
 "mla_3_2": ModelConfig(8, 1, 16, 192, max_seqlen_kv=2048, head_dim_v=128), # inference
+"mla_3_3": ModelConfig(8, 1, 16, 160, max_seqlen_kv=2048, head_dim_v=128), # inference
+"mla_3_4": ModelConfig(8, 1, 16, 160, max_seqlen_kv=2048, head_dim_v=160), # inference
 }
......
@@ -252,8 +252,9 @@ NVTE_Fused_Attn_Backend nvte_get_fused_attn_backend(
 (head_dim_qk == 192 && head_dim_v == 128 && is_training && sm_arch_ >= 100 &&
 cudnn_runtime_version >= 91100)) &&
 // 9.11/9.12 bug: 128 < d_qk <= 256, 128 < d_v <= 256 + Hopper + bprop + MLA
-(!((cudnn_runtime_version == 91100 || cudnn_runtime_version == 91200) && is_training &&
-sm_arch_ == 90 && head_dim_qk >= 128 && head_dim_v >= 128 &&
+(!((cudnn_runtime_version == 91100 || cudnn_runtime_version == 91200 ||
+cudnn_runtime_version == 91300) &&
+is_training && sm_arch_ == 90 && head_dim_qk >= 128 && head_dim_v >= 128 &&
 !(head_dim_qk == 192 && head_dim_v == 128) && head_dim_qk != head_dim_v))) &&
 // bias type
 ((cudnn_runtime_version < 8906 && bias_type == NVTE_Bias_Type::NVTE_NO_BIAS) ||
......
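
Reviewer note: the clause touched in the `nvte_get_fused_attn_backend` hunk above is easier to check when read as a standalone predicate. The sketch below restates only that one negated clause in Python. The helper name `hits_cudnn_mla_bprop_bug` and its argument names are hypothetical and are not part of Transformer Engine. The real function combines this clause with many other conditions, so a `False` result here does not by itself mean the cuDNN backend is selected.

```python
# Hypothetical helper: mirrors only the negated clause shown in the hunk above.
# When it returns True, that clause rules out the cuDNN fused-attention backend.
AFFECTED_CUDNN_VERSIONS = {91100, 91200, 91300}  # 91300 is the value added by this commit


def hits_cudnn_mla_bprop_bug(cudnn_runtime_version, sm_arch, is_training,
                             head_dim_qk, head_dim_v):
    return (
        cudnn_runtime_version in AFFECTED_CUDNN_VERSIONS
        and is_training
        and sm_arch == 90
        and head_dim_qk >= 128
        and head_dim_v >= 128
        and not (head_dim_qk == 192 and head_dim_v == 128)
        and head_dim_qk != head_dim_v
    )


# Head dimensions taken from the mla_3_2, mla_3_3 and mla_3_4 test configs,
# evaluated as if training on sm_arch 90 with cuDNN runtime 91300 (an assumed scenario).
for name, d_qk, d_v in [("mla_3_2", 192, 128), ("mla_3_3", 160, 128), ("mla_3_4", 160, 160)]:
    print(name, hits_cudnn_mla_bprop_bug(91300, 90, True, d_qk, d_v))
```

Under these assumptions only the 160/128 head-dimension pair trips the clause: 192/128 is explicitly exempted and 160/160 has equal head dimensions. The test configs themselves are annotated `# inference`, so whether the clause fires in a given test depends on how `is_training` is set there.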