- 30 Jun, 2025 1 commit
pdr authored
Added MoE model using MixtralConfig.
1. Added 8x7b and 8x22b variants.
2. Requires high VRAM, as all experts are loaded in memory; training is therefore disabled due to the memory constraint on the test worker.

Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
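The memory constraint described above follows from MoE inference loading every expert at once: resident weight memory scales with the total expert count, not with the number of experts active per token. A rough back-of-envelope sketch (the parameter counts below are illustrative assumptions, not measured values from the benchmark):

```python
# Rough MoE weight-memory estimate. With all experts resident in VRAM,
# weight memory grows with n_experts even though only a few experts
# are activated per token.
def moe_weight_gib(n_experts, expert_params_b, shared_params_b, bytes_per_param=2):
    """Estimate weight memory in GiB; params given in billions, fp16/bf16 = 2 bytes."""
    total_params_b = shared_params_b + n_experts * expert_params_b
    return total_params_b * 1e9 * bytes_per_param / 2**30

# Illustrative numbers only: ~5B params per expert FFN, ~7B shared weights.
print(round(moe_weight_gib(8, 5.0, 7.0), 1))  # → 87.5
```

Even under these hand-wavy assumptions the 8-expert variant needs far more VRAM than a typical CI worker GPU provides, which is consistent with the commit disabling training there.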
- 28 Nov, 2024 1 commit
pdr authored
Added llama benchmark:
- Training and inference, in accordance with the existing PyTorch model implementations such as gpt2 and lstm.
- Added a llama fp8 unit test for better code coverage and to reduce the memory required.
- Updated transformers version to >= 4.28.0 for LlamaConfig.
- Set tokenizers version <= 0.20.3 to avoid the 0.20.4 [issues](https://github.com/huggingface/tokenizers/issues/1691) with py3.8.
- Added llama2 to TensorRT; llama2 tests were not added to test_tensorrt_inference_performance.py due to the large memory requirement on the worker GPU, and were validated separately on GH200.

Co-authored-by: dpatlolla <dpatlolla@microsoft.com>
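The version pins above can be captured as standard pip requirement specifiers; a minimal fragment using only the constraints stated in the commit message:

```
transformers>=4.28.0
tokenizers<=0.20.3
```

Pinning tokenizers with an upper bound (rather than a fixed version) keeps older patch releases installable while excluding the 0.20.4 release that broke on Python 3.8.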
- 07 Dec, 2023 1 commit
Yuting Jiang authored
**Description**: Megatron-LM/Megatron-DeepSpeed GPT pretrain benchmark.
- 20 Apr, 2021 2 commits
- 16 Apr, 2021 1 commit
guoshzhao authored
Benchmarks: Add Benchmark - Add GPT2 model benchmark.
- 26 Mar, 2021 1 commit
guoshzhao authored
Benchmarks: Add Benchmark - Add PyTorch BERT benchmarks, including bert-base and bert-large. (#20)
* Add pytorch bert benchmarks.
* Revise code.
* Fix issue.
* Revise code.

Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
- 22 Mar, 2021 1 commit
guoshzhao authored
* Move benchmarks registration from registry.py to __init__.py.
* Revise __init__.

Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
- 24 Feb, 2021 1 commit
guoshzhao authored
* Benchmarks init.

Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>