"docs/source/optimization/other.mdx" did not exist on "c3d78cd3067612175ac9f0f8b234abf5a2e1f510"
[refactor] apply qk norm in attention processors (#9071)
* apply qk norm in attention processors
* revert attention processor
* qk-norm in only attention proc 2.0 and fused variant