Remove memory fence in NVLink barrier. (#253)
* Remove memory fence in NVLink barrier.
* Move `__syncthreads` and fence into barrier.
* Fix bugs.
---------
Co-authored-by: Chenggang Zhao <chenggangz@deepseek.com>