one / TransferBench
Commit c5197729 (Unverified)
Authored Nov 28, 2023 by gilbertlee-amd
Committed by GitHub on Nov 28, 2023
v1.38 Adding missing __threadfence_system() (#70)
Parent: 20ada430
Showing 3 changed files with 7 additions and 1 deletion:
CHANGELOG.md (+5 -0)
src/include/EnvVars.hpp (+1 -1)
src/include/Kernels.hpp (+1 -0)
CHANGELOG.md
@@ -3,6 +3,11 @@
 Documentation for TransferBench is available at
 [https://rocm.docs.amd.com/projects/TransferBench](https://rocm.docs.amd.com/projects/TransferBench).
+## v1.38
+### Fixes
+* Adding missing threadfence which could cause non-fine-grained Transfers to report higher speeds
 ## v1.37
 ### Changes
src/include/EnvVars.hpp
@@ -29,7 +29,7 @@ THE SOFTWARE.
 #include "Compatibility.hpp"
 #include "Kernels.hpp"
-#define TB_VERSION "1.37"
+#define TB_VERSION "1.38"
 extern char const MemTypeStr[];
 extern char const ExeTypeStr[];
src/include/Kernels.hpp
@@ -247,6 +247,7 @@ GpuReduceKernel(SubExecParam* params)
   __syncthreads();
   if (threadIdx.x == 0)
   {
+    __threadfence_system();
     p.stopCycle = wall_clock64();
     p.startCycle = startCycle;
     p.xccId = xccId;
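The ordering problem this hunk addresses can be illustrated with a host-side C++ analogue. This is only a sketch and not TransferBench code: `TimingRecord`, `RunTransfer`, and `TransferCompleteWhenTimed` are hypothetical names, and `std::atomic_thread_fence` stands in for HIP's `__threadfence_system()`, which roughly ensures that writes made before the call are observed system-wide before writes made after it. The point is that the stop timestamp should be published only after the transferred data is visible to other observers. Without that ordering, the measured interval can end early and the reported speed comes out too high.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-subexecutor timing fields (not the real SubExecParam).
struct TimingRecord {
    std::atomic<std::int64_t> stopCycle{0};  // 0 is reserved to mean "not finished yet"
};

// Writer: performs the "transfer", fences, then records the stop timestamp.
void RunTransfer(std::vector<int>& dst, TimingRecord& rec) {
    for (auto& v : dst) v = 1;  // plain stores, playing the role of the copied data

    // Host analogue of __threadfence_system(): the stores above are ordered
    // before the timestamp store below.
    std::atomic_thread_fence(std::memory_order_release);

    std::int64_t now = std::chrono::steady_clock::now().time_since_epoch().count();
    // Clamp so that the sentinel value 0 is never published as a timestamp.
    rec.stopCycle.store(std::max<std::int64_t>(now, 1), std::memory_order_relaxed);
}

// Observer: once the timestamp is visible, every transferred element must be too.
bool TransferCompleteWhenTimed(std::size_t n) {
    std::vector<int> dst(n, 0);
    TimingRecord rec;
    std::thread writer(RunTransfer, std::ref(dst), std::ref(rec));

    // Spin until the writer publishes its stop timestamp.
    while (rec.stopCycle.load(std::memory_order_relaxed) == 0) {
    }
    // Pairs with the writer's release fence.
    std::atomic_thread_fence(std::memory_order_acquire);

    bool ok = true;
    for (int v : dst) ok = ok && (v == 1);
    writer.join();
    return ok;
}
```

Calling `TransferCompleteWhenTimed(1 << 16)` returns `true` because the release/acquire fence pair guarantees the data stores happen before the observer's reads. If the release fence were removed, the C++ memory model would no longer guarantee this, which mirrors the timing inaccuracy described in the v1.38 changelog entry.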