- 08 Aug, 2025 1 commit
-
-
gilbertlee-amd authored
* Fixing issue with P memory type and use of DMA subexecutor
* CMake builds require explicit opt-in by setting NIC_EXEC_ENABLE=1
* Removing self-GPU check for DMA engine copies
* [BUILD] Add new GPU targets and switch to amdclang++ (#187)
  * [BUILD] Add gfx950, gfx1150, and gfx1151 targets
  * [BUILD] Modify CMake to use amdclang++
  * [BUILD] Modify Makefile to use amdclang++
  * [GIT] Updated CHANGELOG and .gitignore
* Adding HBM testing to healthcheck preset
* Tweaking HBM tests to occur first, and provide more info during VERBOSE=1
* Fixing timing reporting issues with NUM_SUBITERATIONS
* [BUILD] Simplify Makefile (#190)
  * Combines steps for compilation and linking
  * Does not rebuild if no change to source code
* Updating CHANGELOG
---------
Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>
-
- 09 Jun, 2025 1 commit
-
-
gilbertlee-amd authored
* Adding non-temporal loads and stores via GFX_TEMPORAL
* Adding additional summary details to a2a preset
* Add SHOW_MIN_ONLY for a2asweep preset
* Adding new P CPU memory type which is indexed by closest GPU
-
- 28 Feb, 2025 1 commit
-
-
gilbertlee-amd authored
Co-authored-by: Mustafa Abduljabbar <mustafa.abduljabbar@amd.com>
-
- 30 Jan, 2025 1 commit
-
-
gilbertlee-amd authored
-
- 24 Jan, 2025 1 commit
-
-
gilbertlee-amd authored
Co-authored-by: Mustafa Abduljabbar <mustafa.abduljabbar@amd.com>
-
- 21 Jan, 2025 1 commit
-
-
gilbertlee-amd authored
Adding NIC execution capabilities and fixing various bugs introduced by the header-only-library refactor
---------
Co-authored-by: Mustafa Abduljabbar <mustafa.abduljabbar@amd.com>
-
- 13 Dec, 2024 3 commits
- 12 Dec, 2024 1 commit
-
-
srawat authored
-
- 05 Dec, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 02 Dec, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 28 Nov, 2024 1 commit
-
-
gilbertlee-amd authored
* Removing C++20 dependencies and modifying how the version is reported
* Changing the default to GFX_SINGLE_TEAM=0
-
- 26 Nov, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 22 Nov, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 21 Nov, 2024 2 commits
-
-
akolliasAMD authored
-
gilbertlee-amd authored
-
- 11 Nov, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 09 Oct, 2024 1 commit
-
-
gilbertlee-amd authored
* Adding USE_HSA_DMA to switch to using hsa_amd_memory_async_copy in lieu of hipMemcpyAsync
* Adding USE_GPU_DMA for A2A benchmark
* Adding largeBAR check and fix for 0-hop GPU-CPU links
-
- 15 Aug, 2024 1 commit
-
-
gilbertlee-amd authored
* Fixing potential out-of-bounds write during topology detection
* Fixing CU_MASK for multi-XCD GPUs
* Adding sub-iterations via NUM_SUBITERATIONS
* Adding support for variable subexecutor Transfers
* Adding healthcheck preset
-
- 03 Apr, 2024 1 commit
-
-
gilbertlee-amd authored
* Adding pcopy benchmark, fixing CPU kernel on null destination
-
- 08 Mar, 2024 1 commit
-
-
gilbertlee-amd authored
-
- 02 Feb, 2024 1 commit
-
-
gilbertlee-amd authored
* Adding targeted DMA engine support
* Fixing CUDA compilation for H100
-
- 09 Jan, 2024 2 commits
-
-
gilbertlee-amd authored
-
gilbertlee-amd authored
-
- 14 Dec, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 05 Dec, 2023 1 commit
-
-
gilbertlee-amd authored
* v1.45 New GFX kernel
-
- 01 Dec, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 30 Nov, 2023 4 commits
-
-
gilbertlee-amd authored
-
gilbertlee-amd authored
-
gilbertlee-amd authored
* v1.41 Adding schmoo benchmark, fixing timing reports for variable-iteration modes
-
gilbertlee-amd authored
-
- 29 Nov, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 28 Nov, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 24 Nov, 2023 2 commits
-
-
gilbertlee-amd authored
-
gilbertlee-amd authored
-
- 22 Nov, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 07 Nov, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 30 Oct, 2023 1 commit
-
-
gilbertlee-amd authored
-
- 19 Oct, 2023 1 commit
-
-
gilbertlee-amd authored
-