Backward weight v4r4r2 with xdlops (#18)
* start
* modify transformat
* modify device convolution
* modify host
* added host conv bwd and wrw
* remove bwd, separate wrw
* clean
* hacall k to zero
* out log
* fixed
* fixed
* change to (out in wei)
* input hack
* hack to out
* format
* fix by comments
* change wei hacks (wei transform has not merged)
* fix program once issue
* fix review comment
* fix vector load issue
* tweak

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>