OpenDAS / apex
Commit 6d6f0bc2
Authored Mar 08, 2019 by Simon Layton
Simplify noop exit condition
Parent: a2799893
Showing 1 changed file with 2 additions and 16 deletions
csrc/multi_tensor_sgd_kernel.cu
@@ -38,13 +38,8 @@ struct SGDFunctor
     bool nesterov,
     bool first_run)
   {
-    __shared__ int noop_smem;
-
-    if(threadIdx.x == 0)
-      noop_smem = *noop_gmem;
-    __syncthreads();
-    if(noop_smem == 1)
-      return;
+    // Early exit if we don't need to do anything
+    if (*noop_gmem) return;
 
     int tensor_loc = tl.block_to_tensor[blockIdx.x];
     int chunk_idx = tl.block_to_chunk[blockIdx.x];
@@ -126,15 +121,6 @@ struct SGDFunctor
         }
       }
     }
-
-      // *noop_gmem = 1 is NOT guaranteed to be seen immediately by thread 0. I wonder if
-      // we can rig block-wide and grid-wide short-circuiting with only one syncthreads.
-      // It's possible we can just lean on the cache (no smem or syncs) and still be fast.
-      if(threadIdx.x == 0)
-        noop_smem = *noop_gmem;
-      __syncthreads();
-      if(noop_smem == 1)
-        break;
     }
   }
 };
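
The change above drops the shared-memory broadcast of the flag and has every thread read `*noop_gmem` directly before doing any work. A minimal, self-contained sketch of that pattern is given here for illustration only. The kernel `scale_unless_noop`, its parameters, and the managed-memory harness are hypothetical and are not code from apex. The `const int*` type for `noop_gmem` is also an assumption, since the diff does not show the declaration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the functor body: scales x in place unless the
// no-op flag is set. This is NOT the SGDFunctor from apex.
__global__ void scale_unless_noop(const int* noop_gmem, float* x, int n, float a)
{
  // Early exit if we don't need to do anything. Every thread reads the flag
  // straight from global memory: no shared memory and no __syncthreads().
  if (*noop_gmem) return;

  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] *= a;
}

int main()
{
  const int n = 1024;
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;

  float* x = nullptr;
  int* noop = nullptr;
  // Managed memory keeps the sketch short; it is a convenience, not a requirement.
  cudaMallocManaged(&x, n * sizeof(float));
  cudaMallocManaged(&noop, sizeof(int));
  for (int i = 0; i < n; ++i) x[i] = 1.0f;

  // Flag set: the kernel should return before touching x.
  *noop = 1;
  scale_unless_noop<<<blocks, threads>>>(noop, x, n, 2.0f);
  cudaDeviceSynchronize();
  printf("flag = 1: x[0] = %f\n", x[0]);

  // Flag cleared: the kernel should scale every element by 2.
  *noop = 0;
  scale_unless_noop<<<blocks, threads>>>(noop, x, n, 2.0f);
  cudaDeviceSynchronize();
  printf("flag = 0: x[0] = %f\n", x[0]);

  cudaFree(x);
  cudaFree(noop);
  return 0;
}
```

Assuming a CUDA-capable GPU with managed-memory support, the sketch can be built with `nvcc` and run directly. Note that it only covers the entry check kept by this commit. The per-iteration re-check inside the chunk loop, which the second hunk removes, is not reproduced.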