- 18 Apr, 2026 9 commits
one authored
- Fix some lint warnings
- Exclude some paths in cpplint
- Fix some tests and formatting
one authored
Adds an opt-in deterministic training mode to SuperBench's PyTorch model benchmarks. When enabled via --enable-determinism, PyTorch's deterministic algorithms are enforced and per-step numerical fingerprints (loss, activation means) are recorded as metrics. These can be compared across runs with the existing sb result diagnosis pipeline to verify bit-exact reproducibility, which is useful for hardware validation and platform comparison.

Flags added:
- --enable-determinism: enable deterministic mode
- --check-frequency: number of steps after which the metrics are recorded
- --deterministic-seed: seed used for deterministic runs

Changes:
- Updated pytorch_base.py to handle deterministic settings and logging.
- Added a new example script: pytorch_deterministic_example.py
- Added a test file, test_pytorch_determinism_all.py, to verify everything works as expected.

Usage:
1. Run 1: run with --enable-determinism; the necessary metrics are recorded in the results-summary.jsonl file.
2. Generate the baseline file from the Run 1 results using sb result generate-baseline.
3. Run 2: run with --enable-determinism on a different machine (or the same machine); the metrics are again recorded in the results-summary.jsonl file.
4. Run diagnosis on the results generated by the two runs using the sb result diagnosis command.

Notes:
1. Make sure all parameters are kept constant between the two runs.
2. Running the diagnosis command requires the rules.yaml file.

Co-authored-by: Aishwarya Tonpe <aishwarya.tonpe25@gmail.com>
Co-authored-by: Ubuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>
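The kind of setup this commit enables can be sketched as follows. This is a minimal illustration of enforcing PyTorch determinism and recording per-step fingerprints, not SuperBench's actual implementation; the function names `enable_determinism` and `step_fingerprint` are hypothetical.

```python
import os
import random

import numpy as np
import torch


def enable_determinism(seed: int = 42) -> None:
    """Force deterministic execution (illustrative sketch of what an
    --enable-determinism flag might do, not SuperBench's code)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Some cuBLAS kernels need this env var to behave deterministically.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.use_deterministic_algorithms(True)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.benchmark = False


def step_fingerprint(loss: torch.Tensor, activations: torch.Tensor) -> dict:
    """Record a per-step numerical fingerprint (loss, activation mean)
    that can be diffed across runs to check bit-exact reproducibility."""
    return {
        "loss": loss.item(),
        "activation_mean": activations.float().mean().item(),
    }
```

Two runs seeded identically should then produce identical fingerprints, which is what the baseline/diagnosis comparison relies on.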
- 17 Apr, 2026 4 commits
- 15 Apr, 2026 1 commit
- 02 Apr, 2026 9 commits
- 01 Apr, 2026 7 commits
- 31 Mar, 2026 1 commit
- 27 Mar, 2026 1 commit
- 25 Mar, 2026 1 commit
- 20 Mar, 2026 1 commit
- 19 Mar, 2026 3 commits
one authored
- Added Platform.DTK in the microbenchmark framework.
- Introduced a new DTK hipblaslt benchmark class and corresponding tests.
- Updated the Dockerfile to include hipblaslt-bench and its permissions.
- Registered DTK benchmarks in the benchmark registry for various performance tests.
- Enhanced GPU detection logic to recognize HYGON GPUs.

This update improves the benchmarking capabilities for DTK, ensuring compatibility and performance testing across platforms.
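Registering a benchmark under a new platform enum, as this commit does, can be sketched roughly like this. The `Platform`, `BenchmarkRegistry`, and `DtkHipblasltBenchmark` classes below are hypothetical stand-ins, not SuperBench's actual classes or signatures.

```python
from enum import Enum


class Platform(Enum):
    """Hardware platforms a benchmark can target (illustrative subset;
    DTK mirrors the new platform added in this commit)."""
    CPU = "CPU"
    CUDA = "CUDA"
    ROCM = "ROCm"
    DTK = "DTK"


class BenchmarkRegistry:
    """Minimal sketch of a platform-aware benchmark registry."""
    _benchmarks: dict = {}

    @classmethod
    def register(cls, name: str, benchmark_cls: type, platform: Platform) -> None:
        # One benchmark name may map to different classes per platform.
        cls._benchmarks[(name, platform)] = benchmark_cls

    @classmethod
    def lookup(cls, name: str, platform: Platform):
        return cls._benchmarks.get((name, platform))


class DtkHipblasltBenchmark:
    """Placeholder for a hipBLASLt GEMM benchmark on the DTK platform."""
    def run(self) -> str:
        return "hipblaslt-bench executed"


BenchmarkRegistry.register("hipblaslt-gemm", DtkHipblasltBenchmark, Platform.DTK)
```

Keying the registry on (name, platform) lets the same benchmark name resolve to different executables per platform, which is how a DTK-specific hipblaslt variant can coexist with CUDA or ROCm ones.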
one authored
- Update rocm_commom.cmake for CMake>=3.24
- Prevent isolation build
- Add BabelStream as a submodule
- Update dockerignore
- 17 Mar, 2026 1 commit
- 11 Mar, 2026 1 commit
Hongtao Zhang authored
## Summary
- Upgrade Intel Memory Latency Checker from v3.11 to v3.12 in rocm5.0.x.dockerfile
- Aligns with other dockerfiles that already use v3.12

Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- 04 Feb, 2026 1 commit
WenqingLan1 authored
Updated third-party submodule gpu-burn to the newest version, adding implementation and documentation support for CUDA 13.0.
Co-authored-by: guoshzhao <guzhao@microsoft.com>