Example for conv2d backward weight fp16 (#106)
* add wrw reference
* start device
* raw non-split version
* run simple example
* start to use atomic add
* simple transform result correct
* first version that can run
* fix atomic and set operator choice
* add split-k check
* format
* change input parameter
* add pad for t total
* rename example index
Co-authored-by: ltqin <letaoqin@amd.com>