Use TMA to optimize internode combine. (#287)
* Let forwarders use a dedicated SM
* Shuffle RDMA idx
* Senders use TMA
* Adjust the tuning chunk size.
* Modify NVL chunk layout.
* Update some combine config.
* Small lint
* Minor fix
* Overlap TMA store
---------
Co-authored-by: Chenggang Zhao <chenggangz@deepseek.com>