- 18 Apr, 2026 9 commits
one authored
- Fix some lint warnings
- Exclude some paths in cpplint
- Fix some tests and formatting
one authored
Adds an opt-in deterministic training mode to SuperBench's PyTorch model benchmarks. When enabled via --enable-determinism, PyTorch's deterministic algorithms are enforced and per-step numerical fingerprints (loss, activation means) are recorded as metrics. These can be compared across runs with the existing sb result diagnosis pipeline to verify bit-exact reproducibility, which is useful for hardware validation and platform comparison.

Flags added:
- --enable-determinism: enable deterministic mode
- --check-frequency: number of steps after which the metrics are recorded
- --deterministic-seed: seed used for deterministic runs

Changes:
- Updated pytorch_base.py to handle deterministic settings and logging.
- Added a new example script: pytorch_deterministic_example.py
- Added a test file, test_pytorch_determinism_all.py, to verify everything works as expected.

Usage:
1. Run 1: run with --enable-determinism; the necessary metrics are recorded in the results-summary.jsonl file.
2. Generate the baseline file from the Run 1 results using sb result generate-baseline.
3. Run 2: run with --enable-determinism on a different machine (or the same machine); the metrics are again recorded in the results-summary.jsonl file.
4. Run diagnosis on the results generated by the two runs using the sb result diagnosis command.

Notes:
1. Make sure all parameters are kept constant between the two runs.
2. Running the diagnosis command requires the rules.yaml file.

Co-authored-by: Aishwarya Tonpe <aishwarya.tonpe25@gmail.com>
Co-authored-by: Ubuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>
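The kind of setup this commit enables can be sketched as follows. This is a minimal illustration of enforcing PyTorch determinism and recording per-step fingerprints, not SuperBench's actual implementation; the function names `enable_determinism` and `step_fingerprint` are hypothetical.

```python
import os
import random

import numpy as np
import torch


def enable_determinism(seed: int = 42) -> None:
    """Force deterministic execution (illustrative sketch of what an
    --enable-determinism flag might do, not SuperBench's code)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Some cuBLAS kernels need this env var to behave deterministically.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.use_deterministic_algorithms(True)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.benchmark = False


def step_fingerprint(loss: torch.Tensor, activations: torch.Tensor) -> dict:
    """Record a per-step numerical fingerprint (loss, activation mean)
    that can be diffed across runs to check bit-exact reproducibility."""
    return {
        "loss": loss.item(),
        "activation_mean": activations.float().mean().item(),
    }
```

Two runs seeded identically should then produce identical fingerprints, which is what the baseline/diagnosis comparison relies on.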
- 17 Apr, 2026 4 commits
- 15 Apr, 2026 1 commit
- 02 Apr, 2026 9 commits
- 01 Apr, 2026 7 commits
- 31 Mar, 2026 1 commit
- 27 Mar, 2026 1 commit
- 25 Mar, 2026 1 commit
- 20 Mar, 2026 1 commit
- 19 Mar, 2026 3 commits
one authored
- Added Platform.DTK in the microbenchmark framework.
- Introduced a new DTK hipblaslt benchmark class and corresponding tests.
- Updated the Dockerfile to include hipblaslt-bench and its permissions.
- Registered DTK benchmarks in the benchmark registry for various performance tests.
- Enhanced GPU detection logic to recognize HYGON GPUs.

This update improves the benchmarking capabilities for DTK, ensuring compatibility and performance testing across platforms.
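Registering a benchmark under a new platform enum, as this commit does, can be sketched roughly like this. The `Platform`, `BenchmarkRegistry`, and `DtkHipblasltBenchmark` classes below are hypothetical stand-ins, not SuperBench's actual classes or signatures.

```python
from enum import Enum


class Platform(Enum):
    """Hardware platforms a benchmark can target (illustrative subset;
    DTK mirrors the new platform added in this commit)."""
    CPU = "CPU"
    CUDA = "CUDA"
    ROCM = "ROCm"
    DTK = "DTK"


class BenchmarkRegistry:
    """Minimal sketch of a platform-aware benchmark registry."""
    _benchmarks: dict = {}

    @classmethod
    def register(cls, name: str, benchmark_cls: type, platform: Platform) -> None:
        # One benchmark name may map to different classes per platform.
        cls._benchmarks[(name, platform)] = benchmark_cls

    @classmethod
    def lookup(cls, name: str, platform: Platform):
        return cls._benchmarks.get((name, platform))


class DtkHipblasltBenchmark:
    """Placeholder for a hipBLASLt GEMM benchmark on the DTK platform."""
    def run(self) -> str:
        return "hipblaslt-bench executed"


BenchmarkRegistry.register("hipblaslt-gemm", DtkHipblasltBenchmark, Platform.DTK)
```

Keying the registry on (name, platform) lets the same benchmark name resolve to different executables per platform, which is how a DTK-specific hipblaslt variant can coexist with CUDA or ROCm ones.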
one authored
- Update rocm_commom.cmake for CMake>=3.24
- Prevent isolation build
- Add BabelStream as a submodule
- Update dockerignore
- 17 Mar, 2026 1 commit
- 11 Mar, 2026 1 commit
Hongtao Zhang authored
## Summary
- Upgrade Intel Memory Latency Checker from v3.11 to v3.12 in rocm5.0.x.dockerfile
- Aligns with other dockerfiles that already use v3.12

Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- 04 Feb, 2026 1 commit
WenqingLan1 authored
Updated third-party submodule gpu-burn to the newest version, adding implementation and documentation support for CUDA 13.0.
Co-authored-by: guoshzhao <guzhao@microsoft.com>