Unverified commit 4eade17a, authored by Shaden Smith and committed by GitHub

News edits (#220)

* BERT title
parent 0c824830
@@ -22,7 +22,7 @@ establishing a new SOTA in the LM category.
# News
* [2020/05/19] [An Order-of-Magnitude Larger and Faster Training with ZeRO-2](https://www.deepspeed.ai/news/2020/05/19/zero-stage2.html)
<span style="color:dodgerblue">**[_NEW_]**</span>
* [2020/05/19] [The Fastest and Most Efficient BERT Training through Optimized Transformer Kernels](https://www.deepspeed.ai/news/2020/05/19/bert-record.html)
<span style="color:dodgerblue">**[_NEW_]**</span>
* [2020/02/13] [Turing-NLG: A 17-billion-parameter language model by Microsoft](https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/)
* [2020/02/13] [ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters](https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/)
---
layout: single
title: "The Fastest and Most Efficient BERT Training through Optimized Transformer Kernels"
excerpt: ""
categories: news
new_post: true
date: 2020-05-19 00:00:00
---
We introduce new technology to accelerate single GPU performance via kernel
optimizations. These optimizations not only create a strong foundation for
scaling out large models, but also improve the single GPU performance of
highly tuned and moderately sized models like BERT by more than 30%, reaching
a staggering performance of 66 teraflops per V100 GPU, which is 52% of the
hardware peak. **Using optimized transformer kernels as the building block,
DeepSpeed achieves the fastest BERT training record: 44 minutes on 1,024
NVIDIA V100 GPUs**, compared with the best published result of 67 minutes on
the same number and generation of GPUs.

For a technical overview, see our [blog post](linklink).

**Code and tutorials are coming soon!**
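As a quick sanity check on the numbers above, here is a back-of-the-envelope sketch. It assumes the commonly cited 125 TFLOPS FP16 Tensor Core peak for a V100; the post itself does not state the peak figure it uses.

```python
# Back-of-the-envelope check of the throughput and speedup claims in the announcement.
# Assumption (not stated in the post): V100 FP16 Tensor Core peak of 125 TFLOPS.
v100_peak_tflops = 125.0
achieved_tflops = 66.0

# Fraction of hardware peak reached by the optimized transformer kernels.
fraction_of_peak = achieved_tflops / v100_peak_tflops
print(f"Fraction of hardware peak: {fraction_of_peak:.1%}")  # ~52.8%, consistent with the ~52% claim

# End-to-end BERT training time comparison on 1,024 V100 GPUs.
deepspeed_minutes = 44
best_published_minutes = 67
print(f"Speedup over best published result: {best_published_minutes / deepspeed_minutes:.2f}x")  # ~1.52x
```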