Implement MI200 FP16 Denorm fix inside threadwise copy (#191)
* start convert
* using buffer load
* add kernel transfer function
* using asm for transfer
* add transpose_half_to_bhalf_2x2
* add TypeMap struct
* add LDSDataType to v2r3 and v2r4r2
* change convert function name
* remove asm in half transfer to bhalf
* fix bug for type_convert
* cshuffle_v1 add LDSDataType
* add LDSDataType for gridwise gemm v2r4
* add LDSDataType to v3r1, v3r2, v3r3
* init complete
* fix function name
* remove comments
* format
* fix for merge develop
Co-authored-by: ltqin <letaoqin@amd.com>