- 05 Jan, 2026 1 commit
gilbertlee-amd authored
* Adding System singleton to support multi-node (communication and topology)
* Adding multi-node parsing, rank and device wildcard expansion
* Adding multi-node topology, and various support functions
* Adding multi-node consistency validation of Config and Transfers
* Introducing SINGLE_KERNEL=1 to Makefile to speed up compilation during development
* Updating CHANGELOG. Overhauling wildcard parsing. Adding dryrun
* Client refactoring. Introduction of tabular formatted results and a2a multi-rank preset
* Adding MPI support into CMakeFiles
* Cleaning up multi-node topology using TableHelper
* Reducing compile time by removing some kernel variants
* Updating documentation. Adding nicrings preset
* Adding NIC_FILTER to allow NIC device filtering via regex
* Updating supported memory types
* Fixing P2P preset, and adding some extra memIndex utility functions
-
- 04 Sep, 2025 1 commit
gilbertlee-amd authored
* Added BLOCKSIZES to a2asweep preset to allow sweeping over threadblock sizes
* Fixing src initialization when using BYTE_OFFSET
* Adding FILL_COMPRESS functionality to allow for different input data patterns
* Updating CHANGELOG regarding GFX_BLOCKSIZE limit increase to 1024
-
- 09 Jun, 2025 1 commit
gilbertlee-amd authored
* Adding non-temporal loads and stores via GFX_TEMPORAL
* Adding additional summary details to a2a preset
* Add SHOW_MIN_ONLY for a2asweep preset
* Adding new P CPU memory type which is indexed by closest GPU
-
- 28 Feb, 2025 1 commit
gilbertlee-amd authored
Co-authored-by: Mustafa Abduljabbar <mustafa.abduljabbar@amd.com>