- 18 Apr, 2026 1 commit
-
-
one authored
Adds an opt-in deterministic training mode to SuperBench's PyTorch model benchmarks. When enabled via `--enable-determinism`, PyTorch's deterministic algorithms are enforced and per-step numerical fingerprints (loss, activation means) are recorded as metrics. These can be compared across runs using the existing `sb result diagnosis` pipeline to verify bit-exact reproducibility, which is useful for hardware validation and platform comparison.

**Flags added**
- `--enable-determinism`
- `--check-frequency`: number of steps after which the metrics are recorded
- `--deterministic-seed`

**Changes**
- Updated `pytorch_base.py` to handle deterministic settings and logging
- Added a new example script: `pytorch_deterministic_example.py`
- Added a test file, `test_pytorch_determinism_all.py`, to verify everything works as expected

**Usage**
1. Run 1: run with `--enable-determinism`; the necessary metrics are recorded in the `results-summary.jsonl` file
2. Generate the baseline file from the Run 1 results using `sb result generate-baseline`
3. Run 2: run with `--enable-determinism` on a different machine (or the same machine); the metrics are again recorded in the `results-summary.jsonl` file
4. Run diagnosis on the results from the two runs using the `sb result diagnosis` command

**Note**
1. Make sure all parameters are identical between the two runs
2. Running the diagnosis command requires the `rules.yaml` file

--------- Co-authored-by:
Aishwarya Tonpe <aishwarya.tonpe25@gmail.com> Co-authored-by:
Ubuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>
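The reproducibility check behind this feature can be sketched in plain Python. This is a minimal illustration of the fingerprint idea only; the function and metric names are hypothetical, not SuperBench's actual API:

```python
import random

def run_training(seed, steps=5, check_frequency=1):
    """Toy stand-in for a deterministic training run: with a fixed seed,
    every run produces an identical sequence of per-step 'loss' values,
    recorded as fingerprints every check_frequency steps."""
    rng = random.Random(seed)
    fingerprints = {}
    loss = 1.0
    for step in range(1, steps + 1):
        loss *= 0.9 + 0.01 * rng.random()  # deterministic given the seed
        if step % check_frequency == 0:
            fingerprints[f"loss_step_{step}"] = loss
    return fingerprints

# Two runs with the same seed yield bit-identical fingerprints,
# which is exactly what the diagnosis pipeline verifies across machines.
run1 = run_training(seed=42)
run2 = run_training(seed=42)
assert run1 == run2
```

In the real benchmarks the fingerprints come from actual loss and activation values, and the comparison is done by `sb result diagnosis` rather than a direct equality check.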
-
- 28 Jan, 2026 1 commit
-
-
Hongtao Zhang authored
**Description** - When building the CUDA 11.1.1 image, pip (Python 3.8) cannot find a pre-built wheel for the latest wandb release (v0.23.1). As a result, pip attempts to build wandb from source. However, the build fails because the image does not have Go installed, which is required for building wandb from source, and the reported error appears. **Solution** - For the CUDA 11.1.1 build, install the build tools wandb requires (e.g., Go, Rust, and Cargo). --------- Co-authored-by:
Hongtao Zhang <hongtaozhang@microsoft.com> Co-authored-by:
Copilot <175728472+Copilot@users.noreply.github.com>
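The fix amounts to a Dockerfile step along these lines. This is an illustrative sketch only; the exact package names and layout in the actual CUDA 11.1.1 Dockerfile may differ:

```dockerfile
# Sketch: install the build tools wandb needs to compile from source
# on images where no pre-built wheel is available for the Python version.
# Package names are assumptions for an Ubuntu base image.
RUN apt-get update && \
    apt-get install -y --no-install-recommends golang rustc cargo && \
    rm -rf /var/lib/apt/lists/*
```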
-
- 28 Nov, 2024 1 commit
-
-
pdr authored
Added llama benchmark - training and inference, in line with the existing PyTorch model implementations such as gpt2 and lstm. - Added a llama fp8 unit test for better code coverage, using fp8 to reduce the memory required - Updated transformers version to >= 4.28.0 for `LlamaConfig` - Set tokenizers version to <= 0.20.3 to avoid [issues](https://github.com/huggingface/tokenizers/issues/1691) in version 0.20.4 with py3.8 - Added llama2 to tensorrt - llama2 tests not added to test_tensorrt_inference_performance.py due to the large memory requirement on worker GPUs; tests validated separately on GH200 --------- Co-authored-by:
dpatlolla <dpatlolla@microsoft.com>
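The version constraints described above correspond to a pip requirements fragment like the following (illustrative; the actual pins live in the project's setup files):

```
transformers>=4.28.0   # provides LlamaConfig
tokenizers<=0.20.3     # avoid the 0.20.4 breakage on Python 3.8
```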
-
- 27 Nov, 2024 1 commit
-
-
Yifan Xiong authored
Upgrade dependency versions in Azure pipeline: * Remove Python 3.6 and add Python 3.10 for cpu-unit-test * Upgrade CUDA from 11.1 to 12.4 for cuda-unit-test * Update labels accordingly --------- Co-authored-by: Dilip Patlolla <dilipreddi@gmail.com>
-
- 30 Dec, 2022 1 commit
-
-
Yuting Jiang authored
**Description** Add a stdout logging util module and enable real-time log flushing in the executor **Major Revision** - Add a stdout logging util module to redirect stdout into the file log - Enable stdout logging in the executor to write benchmark output into both stdout and the file `sb-bench.log` - Enable real-time log flushing in run_command of microbenchmarks through the config `log_flushing` **Minor Revision** - Add `log_n_step` args to enable regular step-time logging in model benchmarks - Update related docs
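The "write to both stdout and a file" behavior is essentially a tee over `sys.stdout`. A minimal sketch of the idea, not SuperBench's actual util module:

```python
import sys

class StdoutTee:
    """Duplicate every write to the real stdout and to a log file,
    flushing the file immediately so the log is usable in real time."""

    def __init__(self, path):
        self.log = open(path, "a")
        self.stdout = sys.stdout

    def write(self, text):
        self.stdout.write(text)
        self.log.write(text)
        self.log.flush()  # real-time flushing: no buffering delay in the file

    def flush(self):
        self.stdout.flush()
        self.log.flush()

# Redirect, print, then restore the original stdout.
sys.stdout = StdoutTee("sb-bench.log")
print("benchmark output")          # reaches both the console and sb-bench.log
sys.stdout = sys.stdout.stdout     # restore
```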
-
- 06 Sep, 2022 1 commit
-
-
Yifan Xiong authored
**Description** Cherry-pick bug fixes from v0.6.0 to main. **Major Revisions** * Enable latency test in ib traffic validation distributed benchmark (#396) * Enhance parameter parsing to allow spaces in value (#397) * Update apt packages in dockerfile (#398) * Upgrade colorlog for NO_COLOR support (#404) * Analyzer - Update error handling to support exit code of sb result diagnosis (#403) * Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399) * Enhance timeout cleanup to avoid possible hanging (#405) * Auto generate ibstat file by pssh (#402) * Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406) * Docs - Upgrade version and release note (#407) * Docs - Fix issues in document (#408) Co-authored-by:
Yang Wang <yangwang1@microsoft.com> Co-authored-by:
Yuting Jiang <yutingjiang@microsoft.com>
-
- 04 Aug, 2022 1 commit
-
-
Yifan Xiong authored
* Gracefully exit when timeout, add corresponding log and return code. * Set minimum timeout to 1 minute and enlarge Ansible timeout.
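The timeout behavior described above can be sketched with the standard library. The function name, the 124 exit code, and the clamping logic are illustrative assumptions, not the actual runner code:

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_sec):
    """Run a command; on timeout, kill it and return a distinct exit
    code instead of hanging (sketch of a graceful-timeout exit)."""
    timeout_sec = max(timeout_sec, 60)  # enforce a 1-minute minimum timeout
    try:
        proc = subprocess.run(cmd, timeout=timeout_sec)
        return proc.returncode
    except subprocess.TimeoutExpired:
        print(f"timeout after {timeout_sec}s: {cmd}")
        return 124  # conventional timeout code, as used by GNU timeout

# A fast command finishes well inside the clamped timeout.
rc = run_with_timeout([sys.executable, "-c", "pass"], timeout_sec=5)
```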
-
- 01 Apr, 2022 1 commit
-
-
guoshzhao authored
**Description** Use the config `log_raw_data` to control whether to log the raw data into a file. The default value is `no`. It can be set to `yes` for particular benchmarks, such as the NCCL/RCCL tests, to save the raw data into a file.
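In a benchmark config this might look like the following (the exact nesting is illustrative; consult the config schema for the real layout):

```yaml
superbench:
  benchmarks:
    nccl-bw:
      parameters:
        log_raw_data: yes   # default is no; enable to save raw NCCL output to file
```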
-
- 19 Jan, 2022 1 commit
-
-
guoshzhao authored
**Description** Add 50th, 90th, 95th, 99th, 99.9th latency metrics for ORT and pytorch inference benchmarks.
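Tail-latency percentiles like these can be computed with a simple nearest-rank method. A sketch of the idea; the benchmarks may use a different interpolation:

```python
def latency_percentile(samples, pct):
    """Return the pct-th percentile of a list of latency samples,
    using nearest-rank selection on the sorted values."""
    ordered = sorted(samples)
    rank = max(0, int(len(ordered) * pct / 100.0 + 0.5) - 1)
    return ordered[min(rank, len(ordered) - 1)]

# Example: latencies of 1..1000 ms, reporting the same percentiles
# as the ORT and PyTorch inference benchmarks.
samples = list(range(1, 1001))
for pct in (50, 90, 95, 99, 99.9):
    print(f"p{pct}:", latency_percentile(samples, pct))
```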
-
- 18 Jan, 2022 1 commit
-
-
Yifan Xiong authored
__Description__ Add command `sb benchmark list` and `sb benchmark list-parameters` to support listing all optional parameters for benchmarks. <details> <summary>Examples</summary> <pre> $ sb benchmark list -n [a-z]+-bw -o table Result -------- mem-bw nccl-bw rccl-bw </pre> <pre> $ sb benchmark list-parameters -n mem-bw === mem-bw === optional arguments: --bin_dir str Specify the directory of the benchmark binary. --duration int The elapsed time of benchmark in seconds. --mem_type str [str ...] Memory types to benchmark. E.g. htod dtoh dtod. --memory str Memory argument for bandwidthtest. E.g. pinned unpinned. --run_count int The run count of benchmark. --shmoo_mode Enable shmoo mode for bandwidthtest. default values: {'bin_dir': None, 'duration': 0, 'mem_type': ['htod', 'dtoh'], 'memory': 'pinned', 'run_count': 1} </pre> </details> __Major Revisions__ * Add `sb benchmark list` to list benchmarks matching the given name. * Add `sb benchmark list-parameters` to list parameters for benchmarks that match the given name. __Minor Revisions__ * Sort and format help text for argparse.
-
- 07 Dec, 2021 1 commit
-
-
guoshzhao authored
**Description** Add return_code metric into result and revise unit tests.
-
- 02 Dec, 2021 1 commit
-
-
guoshzhao authored
**Description** If `ignore_invalid` is True and 'required' arguments are not set when registering the benchmark, the arguments should be provided by the user in the config, and argument checking is skipped.
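The behavior can be sketched as follows. The function name and signature here are illustrative, not the actual registry code:

```python
def check_arguments(args, required, ignore_invalid=False):
    """Return True when all required arguments are set, or when the
    check is bypassed because ignore_invalid is on (in which case the
    user is expected to supply the values via config instead)."""
    missing = [name for name in required if args.get(name) is None]
    if missing and ignore_invalid:
        return True   # skip the check; config must provide these values
    return not missing

# Without ignore_invalid, a missing required argument fails the check;
# with it, the check is skipped.
assert check_arguments({"model": None}, ["model"]) is False
assert check_arguments({"model": None}, ["model"], ignore_invalid=True) is True
```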
-
- 07 Jun, 2021 1 commit
-
-
guoshzhao authored
* Clean up the cache.
-
- 14 Apr, 2021 1 commit
-
-
guoshzhao authored
* Benchmarks: Add Feature - Add interface to get all predefined parameters of all benchmarks.
-
- 12 Apr, 2021 1 commit
-
-
guoshzhao authored
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-
- 09 Apr, 2021 1 commit
-
-
guoshzhao authored
* Fix raw data validation bug. * Address comments. Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-
- 08 Apr, 2021 1 commit
-
-
guoshzhao authored
Benchmarks: Code Revision - Revise BenchmarkRegistry interfaces for integration with executor. (#33) * Revise BenchmarkRegistry interfaces. * Address comments Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-
- 24 Feb, 2021 1 commit
-
-
guoshzhao authored
* Benchmarks init. Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
-