"tests/vscode:/vscode.git/clone" did not exist on "a4c931a9bf182b78bdffee7b71142122eceafe75"
Unverified commit 1230e31b authored by Shaden Smith, committed by GitHub

Adds links to new blog. (#221)

parent 4eade17a
@@ -20,6 +20,9 @@ establishing a new SOTA in the LM category.
 # News
+* [2020/05/19] [ZeRO-2 & DeepSpeed: Shattering Barriers of Deep Learning Speed & Scale](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/)
+<span style="color:dodgerblue">**[_NEW_]**</span>
 * [2020/05/19] [An Order-of-Magnitude Larger and Faster Training with ZeRO-2](https://www.deepspeed.ai/news/2020/05/19/zero-stage2.html)
 <span style="color:dodgerblue">**[_NEW_]**</span>
 * [2020/05/19] [The Fastest and Most Efficient BERT Training through Optimized Transformer Kernels](https://www.deepspeed.ai/news/2020/05/19/bert-record.html)
@@ -17,6 +17,6 @@ DeepSpeed achieves the fastest BERT training record: 44 minutes on 1,024
 NVIDIA V100 GPUs**, compared with the best published result of 67 minutes on
 the same number and generation of GPUs.
-For a technical overview, see our [blog post](linklink).
+For a technical overview, see our [blog post](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/).
 **Code and tutorials are coming soon!**
---
layout: single
title: "ZeRO-2 & DeepSpeed: Shattering Barriers of Deep Learning Speed & Scale"
excerpt: ""
link: https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/
categories: news
new_post: true
date: 2020-05-19 02:00:00
---
@@ -17,7 +17,7 @@ learning training by an order of magnitude. More concretely, ZeRO-2 allows
 training models as large as 170 billion parameters up to 10x faster compared
 to state of the art.
-For more information on ZeRO-2 overview, see our [blog post](linklink).
+For more information on ZeRO-2, see our [blog post](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/).
 For more information on how to use ZeRO-2, see an example of training GPT family of models in this [tutorial](/tutorials/megatron/).
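For readers following the tutorial link above, here is a minimal sketch of how ZeRO stage 2 is switched on, following the pattern in DeepSpeed's getting-started docs (`deepspeed.add_config_arguments` and `deepspeed.initialize`). The config keys come from the documented `ds_config.json` schema; the model, batch size, and file name are illustrative placeholders, not part of this commit.

```python
# Sketch: enabling ZeRO stage 2, following the DeepSpeed getting-started pattern.
# Assumed ds_config.json (keys from the documented config schema; values illustrative):
#   {
#     "train_batch_size": 16,
#     "fp16": {"enabled": true},
#     "zero_optimization": {"stage": 2}
#   }
import argparse

import torch
import deepspeed

parser = argparse.ArgumentParser()
parser = deepspeed.add_config_arguments(parser)  # adds --deepspeed, --deepspeed_config
args = parser.parse_args()

model = torch.nn.Linear(1024, 1024)  # stand-in for a real transformer stack

# The returned engine's backward()/step() apply ZeRO's partitioning of
# optimizer state and gradients across data-parallel ranks.
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters(),
)
```

A script like this is run through the DeepSpeed launcher, e.g. `deepspeed train.py --deepspeed --deepspeed_config ds_config.json`; the Megatron tutorial linked above walks through the full GPT-family setup.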