src/array/cuda/gather_mm.cu · eabcc58e41c0b94316f60ea74ba8bc5a7b0e2096 · OpenDAS / dgl

[Bug][Feature] Added cublasGemm<__half> specialization (#3988) (#4029) · eabcc58e

ndickson-nvidia authored Jun 07, 2022

* Added specialization of cublasGemm function for `__half` type, to try to address https://github.com/dmlc/dgl/issues/3988

* Added USE_FP16 guard

* Added test cases to test_segment_mm, to test newly-added FP16 specialization of cublasGemm

* Replaced for loop in test_segment_mm with pytest.mark.parametrize, as recommended

Co-authored-by: Xin Yao <xiny@nvidia.com>
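
The pattern the first two items describe, an explicit template specialization that is only compiled when a feature macro is defined, can be sketched in plain C++ as below. This is an illustration under stated assumptions, not the code from gather_mm.cu: `Gemm`, `Status`, and `Half` are hypothetical names, `Half` is a float-backed stand-in for CUDA's `__half` so the sketch builds without CUDA headers, and the naive loops stand in for the cuBLAS calls that the real specializations presumably forward to.

```cpp
// Build with -DUSE_FP16 to enable the half-precision path of this sketch.

// Hypothetical stand-in for CUDA's __half (stores a float for simplicity).
struct Half { float value; };

// Hypothetical status code, loosely mirroring a BLAS-style return value.
enum class Status { kSuccess, kNotSupported };

// Primary template: fallback for element types with no dedicated routine.
template <typename T>
Status Gemm(const T* /*a*/, const T* /*b*/, T* /*c*/, int /*m*/, int /*n*/, int /*k*/) {
  return Status::kNotSupported;
}

// float specialization: naive row-major C (m x n) = A (m x k) * B (k x n).
template <>
Status Gemm<float>(const float* a, const float* b, float* c, int m, int n, int k) {
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (int p = 0; p < k; ++p) sum += a[i * k + p] * b[p * n + j];
      c[i * n + j] = sum;
    }
  }
  return Status::kSuccess;
}

#ifdef USE_FP16
// Half specialization, compiled only when USE_FP16 is defined.
// Without the macro, Gemm<Half> resolves to the primary template above.
template <>
Status Gemm<Half>(const Half* a, const Half* b, Half* c, int m, int n, int k) {
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (int p = 0; p < k; ++p) sum += a[i * k + p].value * b[p * n + j].value;
      c[i * n + j].value = sum;
    }
  }
  return Status::kSuccess;
}
#endif  // USE_FP16
```

In this sketch a call such as `Gemm<double>` falls through to the primary template and reports `Status::kNotSupported`, which is the situation a missing specialization produces for any type that lacks one.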


gather_mm.cu 18.6 KB
