Chenggang Zhao authored
* Remove redundant TMA flushes
* Less barrier initialization overhead
* Simplify `elect_one_sync`
* Use `elect_one_sync` instead of lanes
* Minor fix
* Polish testing prints
* Refactor for internode kernels
* Better performance
2012e310