[refactor] apply qk norm in attention processors (#9071)
* apply qk norm in attention processors
* revert attention processor
* qk-norm in only attention proc 2.0 and fused variant
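
For readers unfamiliar with the term, here is a minimal sketch of what applying qk norm inside an attention processor can look like. It is not the code from this commit. It assumes "attention proc 2.0" refers to a processor built on PyTorch 2.0's `scaled_dot_product_attention`, and the function name `attention_with_qk_norm` and the arguments `norm_q` / `norm_k` are hypothetical names chosen for illustration.

```python
import torch
import torch.nn.functional as F


def attention_with_qk_norm(query, key, value, norm_q=None, norm_k=None):
    """Hypothetical helper: normalize query/key, then run SDPA.

    query, key, value: tensors of shape (batch, heads, seq_len, head_dim).
    norm_q, norm_k: optional modules applied over the last dimension.
    """
    # Normalize per head, before the dot product, so that the attention
    # logits are computed from normalized queries and keys.
    if norm_q is not None:
        query = norm_q(query)
    if norm_k is not None:
        key = norm_k(key)
    # PyTorch >= 2.0 attention kernel.
    return F.scaled_dot_product_attention(query, key, value)


torch.manual_seed(0)
batch, heads, seq_len, head_dim = 2, 4, 8, 16
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# LayerNorm over head_dim is one possible choice of norm (an assumption here).
norm_q = torch.nn.LayerNorm(head_dim)
norm_k = torch.nn.LayerNorm(head_dim)

out = attention_with_qk_norm(q, k, v, norm_q, norm_k)
print(out.shape)
```

When both norm arguments are `None`, the helper reduces to plain scaled dot-product attention, which keeps the normalization an opt-in step.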