Commits · 036c4712b1256219e26fd9b5740bba8d6d6596c7 · tsoc / superbenchmark

25 Mar, 2026 1 commit

Benchmark: Model benchmark - deterministic training support (#731) · 036c4712

Aishwarya Tonpe authored Mar 25, 2026

Adds an opt-in deterministic training mode to SuperBench's PyTorch model
benchmarks. When enabled with --enable-determinism, PyTorch deterministic
algorithms are enforced, and per-step numerical fingerprints (loss,
activation means) are recorded as metrics. These can be compared across
runs using the existing sb result diagnosis pipeline to verify bit-exact
reproducibility, which is useful for hardware validation and platform
comparison.
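
As a rough illustration of what a per-step fingerprint could look like, here is a minimal sketch in plain Python. It is a toy stand-in, not the code from pytorch_base.py: the function name `toy_training_run`, the fake loss update, and the use of `float.hex()` as the fingerprint are all assumptions made for this example.

```python
import random


def toy_training_run(seed: int, num_steps: int, check_frequency: int) -> dict:
    """Toy stand-in for a training loop (hypothetical, not SuperBench code).

    Records the loss as a hex string every `check_frequency` steps.
    """
    rng = random.Random(seed)  # a seeded RNG makes the toy run repeatable
    loss = 10.0
    fingerprints = {}
    for step in range(1, num_steps + 1):
        loss *= 1.0 - 0.01 * rng.random()  # fake "training" update
        if step % check_frequency == 0:
            # float.hex() is lossless for finite values, so equal strings
            # imply bit-identical losses.
            fingerprints[step] = loss.hex()
    return fingerprints


run_a = toy_training_run(seed=42, num_steps=100, check_frequency=10)
run_b = toy_training_run(seed=42, num_steps=100, check_frequency=10)
run_c = toy_training_run(seed=43, num_steps=100, check_frequency=10)

print("same seed matches:", run_a == run_b)
print("different seed matches:", run_a == run_c)
```

Comparing exact bit patterns rather than rounded values is what distinguishes a bit-exact check from a tolerance-based one.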

Flags added -

--enable-determinism
--check-frequency: Number of steps after which you want the metrics to
be recorded
--deterministic-seed
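
The three flag names above come from the commit message; how they are declared is not shown here. A minimal argparse sketch, assuming a boolean switch and two integer options, might look like this. The defaults (100 and 42) and the help strings are assumptions for illustration only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the three flags named in the commit message."""
    parser = argparse.ArgumentParser(description="Deterministic-mode flags (sketch)")
    parser.add_argument(
        "--enable-determinism",
        action="store_true",
        help="Turn on deterministic training mode.",
    )
    parser.add_argument(
        "--check-frequency",
        type=int,
        default=100,  # assumed default, not taken from the source
        help="Number of steps between metric recordings.",
    )
    parser.add_argument(
        "--deterministic-seed",
        type=int,
        default=42,  # assumed default, not taken from the source
        help="Seed used when determinism is enabled.",
    )
    return parser


args = build_parser().parse_args(["--enable-determinism", "--check-frequency", "50"])
print(args.enable_determinism, args.check_frequency, args.deterministic_seed)
```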

Changes -

Updated pytorch_base.py to handle deterministic settings, logging.
Added a new example script: pytorch_deterministic_example.py
Added a test file: test_pytorch_determinism_all.py to verify everything
works as expected.

Usage -

Step 1: Run 1 - Run with --enable-determinism; the necessary metrics
are recorded in the results-summary.jsonl file
Step 2: Generate the baseline file from the Run 1 results using
sb result generate-baseline
Step 3: Run 2 - Run with --enable-determinism on a different machine
(or the same machine); the necessary metrics are again recorded in the
results-summary.jsonl file
Step 4: Run diagnosis on the results from the 2 runs using the
sb result diagnosis command

Note -
1. Make sure all the parameters are constant between the 2 runs
2. Running the diagnosis command requires the rules.yaml file
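
To make the idea behind the comparison concrete, the sketch below checks two sets of recorded metrics for bit-exact equality. It is not the sb result diagnosis implementation. It assumes each line of a results file is one JSON object mapping metric names to numbers, and the metric names (`loss_step_10`, `act_mean_step_10`) are hypothetical.

```python
import json
import struct


def parse_jsonl(text: str) -> list:
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def float_bits(value) -> bytes:
    """Return the IEEE-754 double encoding of a number."""
    return struct.pack(">d", float(value))


def mismatched_metrics(baseline: dict, candidate: dict) -> list:
    """List metric names that are missing on one side or differ in any bit."""
    names = sorted(set(baseline) | set(candidate))
    return [
        name
        for name in names
        if name not in baseline
        or name not in candidate
        or float_bits(baseline[name]) != float_bits(candidate[name])
    ]


# Hypothetical records standing in for two results-summary.jsonl files.
run1 = parse_jsonl('{"loss_step_10": 0.3, "act_mean_step_10": 0.125}\n')
run2 = parse_jsonl('{"loss_step_10": 0.3, "act_mean_step_10": 0.125}\n')
run3 = parse_jsonl('{"loss_step_10": 0.30000000000000004, "act_mean_step_10": 0.125}\n')

print(mismatched_metrics(run1[0], run2[0]))
print(mismatched_metrics(run1[0], run3[0]))
```

The third record differs from the first only in the last bit of the loss, which a tolerance-based comparison would typically accept but a bit-exact one flags.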

---------
Co-authored-by: Ubuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>


29 Sep, 2025 1 commit

Benchmark: Model benchmark - add option to exclude data copy time in model benchmarks (#734) · 76066b6d

Yuting Jiang authored Sep 29, 2025

**Description**
add option to exclude data copy time in model benchmarks.

**Major Revision**
- add an option --no_copy
- move start time after data copy finish
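
The effect of moving the timer start can be sketched as follows. This is a simplified illustration, not the benchmark's code: `measure_step`, `FakeClock`, and the injected `clock` parameter are hypothetical names introduced so the example runs deterministically without real data copies.

```python
import time


def measure_step(copy_data, compute, no_copy, clock=time.perf_counter):
    """Time one step; with no_copy=True the timer starts after the data copy."""
    if no_copy:
        batch = copy_data()
        start = clock()  # copy time is excluded
    else:
        start = clock()  # copy time is included
        batch = copy_data()
    compute(batch)
    return clock() - start


class FakeClock:
    """Manually advanced clock so the example does not depend on real time."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


fake = FakeClock()


def fake_copy():
    fake.advance(5.0)  # pretend the copy takes 5 time units
    return "batch"


def fake_compute(batch):
    fake.advance(2.0)  # pretend the compute takes 2 time units


without_copy = measure_step(fake_copy, fake_compute, no_copy=True, clock=fake)
with_copy = measure_step(fake_copy, fake_compute, no_copy=False, clock=fake)
print(without_copy, with_copy)
```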


28 Nov, 2024 1 commit

Benchmarks - Add LLaMA-2 Models (#668) · 249e21c1

pdr authored Nov 27, 2024

Added a llama benchmark (training and inference), following the
existing pytorch model implementations such as gpt2 and lstm.

- added llama fp8 unit test for better code coverage, to reduce memory
required
- updated transformers version >= 4.28.0 for LlamaConfig
- set tokenizers version <= 0.20.3 to avoid 0.20.4 version
[issues](https://github.com/huggingface/tokenizers/issues/1691) with
py3.8
- added llama2 to tensorrt
- llama2 tests not added to test_tensorrt_inference_performance.py due
to large memory requirement for worker gpu. tests validated separately
on gh200

---------
Co-authored-by: dpatlolla <dpatlolla@microsoft.com>


16 Jun, 2023 1 commit

Benchmarks - Update outdate references (#539) · e909ddd0

guoshzhao authored Jun 16, 2023

**Description**
Update outdated reference links that return 404.

25 Mar, 2023 1 commit

Benchmarks - Support TE FP8 in BERT/GPT2 models (#496) · c88c9709

Yifan Xiong authored Mar 25, 2023

Support Transformer Engine FP8 in existing PyTorch BERT/GPT2 models by
converting linear/layernorm to TE layers.
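
The conversion described here amounts to walking the model tree and swapping layer types. The sketch below shows that pattern with toy classes only. `Module`, `Linear`, `LayerNorm`, `TELinear`, and `TELayerNorm` are hypothetical stand-ins, not PyTorch or Transformer Engine classes, and a real conversion would also need to carry over weights and layer dimensions.

```python
class Module:
    """Toy container with named children (stand-in for a framework module)."""

    def __init__(self, **children):
        self.children = dict(children)


class Linear(Module):
    pass


class LayerNorm(Module):
    pass


class TELinear(Module):
    """Stand-in for an FP8-capable linear layer."""


class TELayerNorm(Module):
    """Stand-in for an FP8-capable layernorm layer."""


# Which toy layer types get swapped, and for what.
REPLACEMENTS = {Linear: TELinear, LayerNorm: TELayerNorm}


def convert_layers(module: Module) -> Module:
    """Recursively replace matching children in place and return the module."""
    for name, child in list(module.children.items()):
        target = REPLACEMENTS.get(type(child))
        if target is not None:
            module.children[name] = target()  # weight copying omitted in this toy
        else:
            convert_layers(child)  # descend into non-matching containers
    return module


model = Module(
    encoder=Module(dense=Linear(), norm=LayerNorm()),
    head=Linear(),
)
convert_layers(model)
print(type(model.children["head"]).__name__)
```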


30 Dec, 2022 1 commit

Executor - Add stdout logging util module and enable real-time logging flushing in executor (#445) · 9dfefce3

Yuting Jiang authored Dec 30, 2022

**Description**
Add stdout logging util module and enable real-time logging flushing in executor

**Major Revision**
- Add stdout logging util module to redirect stdout into file log
- enable stdout logging in executor to write benchmark output into both stdout and file `sb-bench.log`
- enable real-time log flushing in run_command of microbenchmarks through config `log_flushing`

**Minor Revision**
- add log_n_step args to enable regular step time log in model benchmarks
- update related docs
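
Writing benchmark output to both stdout and a log file is commonly done with a small "tee" object. The following is a minimal sketch of that pattern, not the util module added in this commit: the `Tee` class is a hypothetical name, and `io.StringIO` buffers stand in for the real terminal and for `sb-bench.log` so the example has no side effects.

```python
import contextlib
import io


class Tee:
    """File-like object that forwards every write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
            stream.flush()  # flush immediately so output appears in real time
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


console = io.StringIO()   # stands in for the real terminal
log_file = io.StringIO()  # stands in for an opened sb-bench.log

with contextlib.redirect_stdout(Tee(console, log_file)):
    print("step 1: 12.5 ms")

print(log_file.getvalue(), end="")
```

In a real setting the two streams would be `sys.stdout` and a file opened for appending; flushing on every write trades a little throughput for logs that stay current if the process is killed.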


29 Apr, 2022 1 commit

Release - SuperBench v0.5.0 (#350) · 6681c720

Yifan Xiong authored Apr 29, 2022



**Description**

Cherry-pick bug fixes from v0.5.0 to main.

**Major Revisions**

* Bug - Force to fix ort version as '1.10.0' (#343)
* Bug - Support no matching rules and unify the output name in result_summary (#345)
* Analyzer - Support regex in annotations of benchmark naming for metrics in rules (#344)
* Bug - Fix bugs in sync results on root rank for e2e model benchmarks (#342)
* Bug - Fix bug of duration feature for model benchmarks in distributed mode (#347)
* Docs - Upgrade version and release note (#348)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>


20 Apr, 2021 1 commit

Benchmarks: Add Benchmark - Add LSTM model benchmarks. (#60) · 2a7ab691

guoshzhao authored Apr 20, 2021

* Benchmarks: Add Benchmark - Add LSTM model benchmarks.

16 Apr, 2021 1 commit

Benchmarks: Add Benchmark - Add GPT2 model benchmark. (#57) · af567cf6

guoshzhao authored Apr 16, 2021

* Benchmarks: Add Benchmark - Add GPT2 model benchmark.

26 Mar, 2021 1 commit

Benchmarks: Add Benchmark - Add Pytorch BERT benchmarks, including bert-base... · 0972b223

guoshzhao authored Mar 26, 2021


Benchmarks: Add Benchmark - Add Pytorch BERT benchmarks, including bert-base and bert-large.   (#20)

* add pytorch bert benchmarks.

* revise code

* fix issue

* revise code.
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
