Commits · 47d4a79d5868a7173fc580a55e16c8486a6ce32f · tsoc / superbenchmark

18 Apr, 2026 1 commit

Benchmark: Model benchmark - deterministic training support (#731) (#2) · 47d4a79d

one authored Apr 18, 2026



Adds an opt-in deterministic training mode to SuperBench's PyTorch model
benchmarks. When enabled with --enable-determinism, PyTorch deterministic
algorithms are enforced, and per-step numerical fingerprints (loss,
activation means) are recorded as metrics. These can be compared across
runs using the existing sb result diagnosis pipeline to verify bit-exact
reproducibility, which is useful for hardware validation and platform
comparison.
 
Flags added - 

--enable-determinism
--check-frequency: Interval, in steps, at which the metrics are
recorded
--deterministic-seed
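
As a rough illustration of how a seed and a check frequency interact, the following stdlib-only Python sketch records a fingerprint every `check_frequency` steps of a seeded toy loop. The function name `run_toy_training`, the metric key format, and the toy loss are hypothetical stand-ins, not the code in pytorch_base.py. A real run would rely on PyTorch's own seeding and deterministic-algorithm settings.

```python
import random


def run_toy_training(seed, num_steps, check_frequency):
    """Run a seeded toy loop, recording a loss fingerprint every check_frequency steps.

    Hypothetical stand-in for a model benchmark: the "loss" is a seeded
    pseudo-random decay, so identical seeds give identical values.
    """
    rng = random.Random(seed)  # private RNG, so global random state cannot leak in
    loss = 10.0
    fingerprints = {}
    for step in range(1, num_steps + 1):
        loss *= 0.9 + 0.05 * rng.random()  # deterministic given the seed
        if step % check_frequency == 0:
            # float.hex() keeps every bit, so the comparison is bit-exact.
            fingerprints[f"loss_step_{step}"] = loss.hex()
    return fingerprints


run_a = run_toy_training(seed=42, num_steps=100, check_frequency=10)
run_b = run_toy_training(seed=42, num_steps=100, check_frequency=10)
run_c = run_toy_training(seed=43, num_steps=100, check_frequency=10)

print("same seed matches:", run_a == run_b)
print("different seed matches:", run_a == run_c)
```

Storing the hexadecimal form of each float avoids any rounding introduced by decimal formatting, which matters when the goal is bit-exact rather than approximate agreement.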

Changes - 

Updated pytorch_base.py to handle deterministic settings and logging.
Added a new example script: pytorch_deterministic_example.py
Added a test file, test_pytorch_determinism_all.py, to verify that
everything works as expected.

Usage - 

Step 1: Run 1 - Run with --enable-determinism; the necessary metrics
are recorded in the results-summary.jsonl file.
Step 2: Generate the baseline file from the Run 1 results using
sb result generate-baseline.
Step 3: Run 2 - Run with --enable-determinism on a different machine
(or the same machine); the necessary metrics are recorded in that run's
results-summary.jsonl file.
Step 4: Run diagnosis on the results of the two runs using the
sb result diagnosis command.

Note - 
1. Keep all parameters identical between the two runs.
2. Running the diagnosis command requires the rules.yaml file.
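
The comparison that the four steps aim at can be sketched in plain Python. This is only an illustration of an exact-match check over two JSON Lines summaries; it is not how sb result diagnosis is implemented. The metric names, the `node` field, and the helpers `load_metrics` and `diff_metrics` are assumptions made for the example.

```python
import json


def load_metrics(jsonl_text):
    """Parse JSON Lines text into a list of metric dicts, one per line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def diff_metrics(baseline, candidate):
    """Return {metric: (baseline_value, candidate_value)} for every mismatch."""
    mismatches = {}
    for key in sorted(set(baseline) | set(candidate)):
        if baseline.get(key) != candidate.get(key):
            mismatches[key] = (baseline.get(key), candidate.get(key))
    return mismatches


# Hypothetical one-line summaries; real files hold many more metrics.
run1 = '{"node": "node0", "loss_step_10": 2.5, "loss_step_20": 1.75}\n'
run2 = '{"node": "node1", "loss_step_10": 2.5, "loss_step_20": 1.7500001}\n'

base = load_metrics(run1)[0]
cand = load_metrics(run2)[0]
# The node name is expected to differ between machines, so drop it first.
base.pop("node")
cand.pop("node")

print(diff_metrics(base, cand))
```

An empty result means the two runs agree exactly on every recorded metric; any entry points at the first place to look for a source of nondeterminism.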

---------
Co-authored-by: Aishwarya Tonpe <aishwarya.tonpe25@gmail.com>
Co-authored-by: Ubuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>

47d4a79d

30 Jun, 2025 1 commit

Benchmarks: Add Mixture of Experts Model (#679) · 44e35cda

pdr authored Jun 30, 2025



Added MoE model using MixtralConfig. 

1. Added 8x7b and 8x22b variants 
2. Requires high VRAM because all experts are loaded in memory, so
training is disabled due to memory constraints on the test worker.
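
A back-of-the-envelope calculation shows why memory is the constraint. The parameter totals below are approximate, publicly reported figures for the two Mixtral sizes and are an assumption here; the commit does not state them. The helper `weights_gb` is hypothetical.

```python
def weights_gb(num_params, bytes_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Approximate total parameter counts (assumption, not taken from this commit).
TOTAL_PARAMS = {"8x7b": 46.7e9, "8x22b": 141e9}

for name, params in TOTAL_PARAMS.items():
    # 2 bytes per parameter corresponds to fp16 or bf16 weights.
    print(f"{name}: ~{weights_gb(params, 2):.0f} GB of 16-bit weights")
```

Training needs additional memory for gradients and optimizer state on top of the weights, so its footprint is a multiple of these figures.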

---------
Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>

44e35cda

28 Nov, 2024 1 commit

Benchmarks - Add LLaMA-2 Models (#668) · 249e21c1

pdr authored Nov 27, 2024

Added llama benchmark (training and inference), in accordance with the
existing PyTorch model implementations such as gpt2 and lstm.

- added llama fp8 unit test for better code coverage, to reduce memory
required
- updated transformers version >= 4.28.0 for LlamaConfig
- set tokenizers version <= 0.20.3 to avoid 0.20.4 version
[issues](https://github.com/huggingface/tokenizers/issues/1691) with
py3.8
- added llama2 to tensorrt
- llama2 tests not added to test_tensorrt_inference_performance.py due
to large memory requirement for worker gpu. tests validated separately
on gh200

---------
Co-authored-by: dpatlolla <dpatlolla@microsoft.com>

249e21c1

07 Dec, 2023 1 commit
- Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark (#582) · dd5a6329
  Yuting Jiang authored Dec 07, 2023
```
**Description**
Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark
```
  dd5a6329
28 Jan, 2023 1 commit

Release - SuperBench v0.7.0 (#468) · b07fda15

Yifan Xiong authored Jan 28, 2023



**Description**

Cherry-pick bug fixes from v0.7.0 to main.

**Major Revisions**

* Benchmarks - Fix missing include in FP8 benchmark (#460)
* Fix bug in TE BERT model (#461)
* Doc - Update benchmark doc (#465)
* Bug: Fix bug for incorrect datatype judgement in cublas-function
source code (#464)
* Support `sb deploy` without pulling image (#466)
* Docs - Upgrade version and release note (#467)

Co-authored-by: Russell J. Hewett <russell.j.hewett@gmail.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

b07fda15

25 Jan, 2022 1 commit

Config - Update benchmark naming to support annotations (#284) · 7d7cd3dc

Yifan Xiong authored Jan 25, 2022

__Description__

Update benchmark naming to support annotations.

__Major Revisions__
- Update name for `create_benchmark_context` in executor.
- Backward compatibility for model benchmarks using "_models" suffix.
- Update documents.

7d7cd3dc

19 Jan, 2022 1 commit
- Benchmarks: Add Feature - Add percentile metrics for ort and pytorch inference benchmarks (#283) · fd2bc9e0
  guoshzhao authored Jan 19, 2022
```
**Description**
Add 50th, 90th, 95th, 99th, and 99.9th percentile latency metrics for ORT and PyTorch inference benchmarks.
```
  fd2bc9e0
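
For readers unfamiliar with percentile latency metrics, here is a minimal Python sketch that computes the same five percentiles from a list of step latencies. It uses linear interpolation between neighbouring samples, which is an assumption; the commit does not say which interpolation method the benchmarks use. The `percentile` helper and the sample data are hypothetical.

```python
import math


def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical per-step latencies in milliseconds: 1.0, 2.0, ..., 101.0.
latencies_ms = [float(v) for v in range(1, 102)]

for p in (50, 90, 95, 99, 99.9):
    print(f"p{p}: {percentile(latencies_ms, p):.2f} ms")
```

High percentiles such as the 99.9th need many samples to be meaningful; with only a few hundred steps they are dominated by the one or two slowest measurements.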
09 Dec, 2021 1 commit
- Benchmarks: Unify metric names of benchmarks (#252) · 9f56b219
  Yuting Jiang authored Dec 09, 2021
```
**Description**
Unify metric names of benchmarks.
```
  9f56b219
27 Oct, 2021 1 commit
- Docs - Add introduction and metrics in benchmarks docs (#233) · 976803f8
  Yifan Xiong authored Oct 27, 2021
```
Add introduction and metrics for micro-benchmarks and model-benchmarks document.
```
  976803f8
12 Oct, 2021 1 commit

Docs - Refine document structure (#225) · 3d0fde12

Yifan Xiong authored Oct 12, 2021

__Major Revisions__

* Refine document structure for user tutorial.

__Minor Revisions__

* Add AMD part in installation.
* Change default config file to latest link.

3d0fde12

30 Jun, 2021 1 commit
- Docs - Release Note and Introduction (#107) · 2710fad5
  TobeyQin authored Jun 30, 2021
```
* Add introduction and release documents.
* Fix some typos in documents.
```
  2710fad5
25 Jun, 2021 1 commit
- Docs - Update SuperBench documents (#101) · 832e392f
  Yifan Xiong authored Jun 25, 2021
```
Update SuperBench documents.
```
  832e392f