Commit 2126b59a authored by TiagoMAntunes

Fixed asynchronous streams in column reduce kernel call

parent c96f8863
@@ -146,7 +146,7 @@ void fmoe_cuda_linear_backward_impl(
         if (has_bias) {
             column_reduce
-            <<<grid_threads, block_threads, sizeof(scalar_t)*1024, smgr->stream(0)>>>
+            <<<grid_threads, block_threads, sizeof(scalar_t)*1024, smgr->stream(i)>>>
             (
                 grad_output_buf + ptr * out_feat,
                 grad_bias + i * out_feat,
...
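
The change moves the column_reduce launch from stream 0 to stream i. The sketch below illustrates why the stream argument can matter. It is a standalone toy, not the project's code: the kernels `fill` and `column_sum`, the `streams` vector, and all sizes are hypothetical, and it assumes (the diff does not show this) that the other work for expert i is queued on that expert's own stream. Kernels launched on the same stream run in launch order, so a reduction placed on the producer's stream sees completed data without extra synchronization.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical producer: fills one expert's chunk with a constant value.
__global__ void fill(float* buf, int n, float value) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) buf[idx] = value;
}

// Hypothetical stand-in for column_reduce: sums each column of a
// row-major (rows x cols) matrix into out[col].
__global__ void column_sum(const float* in, float* out, int rows, int cols) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col >= cols) return;
    float acc = 0.0f;
    for (int r = 0; r < rows; ++r) acc += in[r * cols + col];
    out[col] = acc;
}

int main() {
    const int num_expert = 4, rows = 8, cols = 16;  // illustrative sizes

    float *buf = nullptr, *bias = nullptr;
    cudaMalloc(&buf, sizeof(float) * num_expert * rows * cols);
    cudaMalloc(&bias, sizeof(float) * num_expert * cols);

    // One stream per expert, mirroring the smgr->stream(i) pattern in the diff.
    std::vector<cudaStream_t> streams(num_expert);
    for (auto& s : streams) cudaStreamCreate(&s);

    for (int i = 0; i < num_expert; ++i) {
        float* chunk = buf + i * rows * cols;
        // Producer work for expert i goes on stream i.
        fill<<<1, rows * cols, 0, streams[i]>>>(chunk, rows * cols, float(i + 1));
        // The reduction uses the same stream, so it is ordered after the
        // producer. Launching it on streams[0] for every i would drop that
        // ordering guarantee for i > 0.
        column_sum<<<1, cols, 0, streams[i]>>>(chunk, bias + i * cols, rows, cols);
    }
    cudaDeviceSynchronize();

    std::vector<float> host(num_expert * cols);
    cudaMemcpy(host.data(), bias, sizeof(float) * host.size(),
               cudaMemcpyDeviceToHost);
    for (int i = 0; i < num_expert; ++i)
        printf("expert %d: bias[0] = %.1f\n", i, host[i * cols]);

    for (auto& s : streams) cudaStreamDestroy(s);
    cudaFree(buf);
    cudaFree(bias);
    return 0;
}
```

With the launches as written, each printed value should equal rows * (i + 1), since every column holds `rows` copies of the fill value. Compile with nvcc to try it; error checking is omitted for brevity.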