Commit 2126b59a authored by TiagoMAntunes

Fixed asynchronous streams in column reduce kernel call

parent c96f8863
@@ -146,7 +146,7 @@ void fmoe_cuda_linear_backward_impl(
if (has_bias) {
column_reduce
-<<<grid_threads, block_threads, sizeof(scalar_t)*1024, smgr->stream(0)>>>
+<<<grid_threads, block_threads, sizeof(scalar_t)*1024, smgr->stream(i)>>>
(
grad_output_buf + ptr * out_feat,
grad_bias + i * out_feat,
......
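
For context, the fourth launch-configuration argument selects the CUDA stream a kernel is queued on, so the change above moves each expert's `column_reduce` launch from stream 0 to that expert's own stream. The sketch below illustrates only that launch pattern; it is not the FastMoE code. `column_sum`, `demo_per_stream_column_sum`, and the plain `cudaStream_t` vector are hypothetical stand-ins for `column_reduce` and `smgr`, and it assumes that the other work for expert `i` is also queued on stream `i`, which the diff does not show.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Return -1 from the enclosing function if a CUDA runtime call fails.
#define CHECK(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Hypothetical stand-in for column_reduce: one thread per column sums
// that column over `rows` rows of a row-major matrix.
__global__ void column_sum(const float* in, float* out, int rows, int cols) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c < cols) {
        float acc = 0.0f;
        for (int r = 0; r < rows; ++r) acc += in[r * cols + c];
        out[c] = acc;
    }
}

// Launches one reduction per expert, each on its own stream.
// Returns the number of mismatching outputs, or -1 on a CUDA error.
int demo_per_stream_column_sum() {
    const int num_expert = 4, rows = 8, cols = 16;
    const int per_expert = rows * cols;

    // Expert i's slice is filled with the constant i + 1, so every
    // column of that slice must sum to rows * (i + 1).
    std::vector<float> h_in(num_expert * per_expert);
    for (int i = 0; i < num_expert; ++i)
        for (int k = 0; k < per_expert; ++k)
            h_in[i * per_expert + k] = float(i + 1);

    float *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc(&d_in, h_in.size() * sizeof(float)));
    CHECK(cudaMalloc(&d_out, num_expert * cols * sizeof(float)));
    CHECK(cudaMemcpy(d_in, h_in.data(), h_in.size() * sizeof(float),
                     cudaMemcpyHostToDevice));

    std::vector<cudaStream_t> streams(num_expert);
    for (int i = 0; i < num_expert; ++i) CHECK(cudaStreamCreate(&streams[i]));

    const int block = 128, grid = (cols + block - 1) / block;
    for (int i = 0; i < num_expert; ++i) {
        // Fourth launch argument: queue this expert's kernel on stream i,
        // so it is ordered after earlier work submitted to that stream.
        column_sum<<<grid, block, 0, streams[i]>>>(
            d_in + i * per_expert, d_out + i * cols, rows, cols);
    }
    CHECK(cudaGetLastError());

    // Wait for every stream before reading the results back.
    for (int i = 0; i < num_expert; ++i) CHECK(cudaStreamSynchronize(streams[i]));

    std::vector<float> h_out(num_expert * cols);
    CHECK(cudaMemcpy(h_out.data(), d_out, h_out.size() * sizeof(float),
                     cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (int i = 0; i < num_expert; ++i)
        for (int c = 0; c < cols; ++c)
            if (h_out[i * cols + c] != float(rows * (i + 1))) ++mismatches;

    for (int i = 0; i < num_expert; ++i) CHECK(cudaStreamDestroy(streams[i]));
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    return mismatches;
}
```

Kernels on the same stream run in submission order, while kernels on different streams carry no ordering guarantee relative to each other. If the data a reduction reads is produced on stream `i`, launching the reduction on stream 0 could therefore let it start before that data is ready.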