1. 25 Mar, 2026 1 commit
    • Aishwarya Tonpe's avatar
      Benchmark: Model benchmark - deterministic training support (#731) · 036c4712
      Aishwarya Tonpe authored
      
      
      Adds opt-in deterministic training mode to SuperBench's PyTorch model
      benchmarks. When enabled --enable-determinism. PyTorch deterministic
      algorithms are enforced, and per-step numerical fingerprints (loss,
      activation means) are recorded as metrics. These can be compared across
      runs using the existing sb result diagnosis pipeline to verify bit-exact
      reproducibility — useful for hardware validation and platform
      comparison.
       
      Flags added - 
      
      --enable-determinism
      --check-frequency: Number of steps after which you want the metrics to
      be recorded
      --deterministic-seed
      
      Changes - 
      
      Updated pytorch_base.py to handle deterministic settings, logging.
      Added a new example script: pytorch_deterministic_example.py
      Added a test file: test_pytorch_determinism_all.py to verify everything
      works as expected.
      
      Usage - 
      
      Step 1: Run 1 - Run with --enable-determinism and the necessary metrics
      will be recorded in the results-summary.jsonl file
      Step 2: Generate the baseline file from the Run 1 results using - sb
      result generate-baseline
      Step 3: Run 2 - Run with --enable-determinism and the necessary metrics
      will be recorded in the results-summary.jsonl file on a different
      machine (or the same machine)
      Step 4: Run diagnosis on the results generated from the 2 runs using the
      - sb result diagnosis command
      
      Note - 
      1. Make sure all the parameters are constant between the 2 runs 
      2. Running the diagnosis command requires the rules.yaml file
      
      ---------
      Co-authored-by: default avatarUbuntu <rdadmin@HPCPLTNODE0.n3kgq4m0lhoednrx3hxtad2nha.cdmx.internal.cloudapp.net>
      036c4712
  2. 29 Sep, 2025 1 commit
  3. 28 Nov, 2024 1 commit
    • pdr's avatar
      Benchmarks - Add LLaMA-2 Models (#668) · 249e21c1
      pdr authored
      Added llama benchmark - training and inference in accordance with the
      existing pytorch models implementation like gpt2, lstm etc.
      
      - added llama fp8 unit test for better code coverage, to reduce memory
      required
      - updated transformers version >= 4.28.0 for LLamaConfig
      - set tokenizers version <= 0.20.3 to avoid 0.20.4 version
      [issues](https://github.com/huggingface/tokenizers/issues/1691
      
      ) with
      py3.8
      - added llama2 to tensorrt
      - llama2 tests not added to test_tensorrt_inference_performance.py due
      to large memory requirement for worker gpu. tests validated separately
      on gh200
      
      ---------
      Co-authored-by: default avatardpatlolla <dpatlolla@microsoft.com>
      249e21c1
  4. 16 Jun, 2023 1 commit
  5. 25 Mar, 2023 1 commit
  6. 30 Dec, 2022 1 commit
    • Yuting Jiang's avatar
      Executor - Add stdout logging util module and enable real-time logging flushing in executor (#445) · 9dfefce3
      Yuting Jiang authored
      **Description**
      Add stdout logging util module and enable real-time logging flushing in executor
      
      **Major Revision**
      - Add stdout logging util module to redirect stdout into file log
      - enable stdout logging in executor to write benchmark output into both stdout and file `sb-bench.log`
      - enable real-time log flushing in run_command of microbenchmarks through config `log_flushing`
      
      **Minor Revision**
      - add log_n_step args to enable regular step time log in model benchmarks 
      - udpate related docs
      9dfefce3
  7. 29 Apr, 2022 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.5.0 (#350) · 6681c720
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick  bug fixes from v0.5.0 to main.
      
      **Major Revisions**
      
      * Bug - Force to fix ort version as '1.10.0' (#343)
      * Bug - Support no matching rules and unify the output name in result_summary (#345)
      * Analyzer - Support regex in annotations of benchmark naming for metrics in rules (#344)
      * Bug - Fix bugs in sync results on root rank for e2e model benchmarks (#342)
      * Bug - Fix bug of duration feature for model benchmarks in distributed mode (#347)
      * Docs - Upgrade version and release note (#348)
      Co-authored-by: default avatarYuting Jiang <v-yutjiang@microsoft.com>
      6681c720
  8. 20 Apr, 2021 1 commit
  9. 16 Apr, 2021 1 commit
  10. 26 Mar, 2021 1 commit