---
layout: single
title: "10x bigger model training on a single GPU with ZeRO-Offload"
excerpt: ""
categories: news
new_post: true
date: 2020-09-09 00:00:00
---

We introduce a new technology called ZeRO-Offload to enable **10x bigger model training on a single GPU**. ZeRO-Offload extends ZeRO-2 to leverage both CPU and GPU memory for training large models. Using a machine with **a single GPU**, our users can now run **models of up to 13 billion parameters** without running out of memory, 10x bigger than existing approaches, while obtaining competitive throughput. This feature democratizes multi-billion-parameter model training and opens the door for many deep learning practitioners to explore bigger and better models.

* For more information on ZeRO-Offload, see our [press release]( {{ site.press_release_v3 }} ).
* For more information on how to use ZeRO-Offload, see our [ZeRO-Offload tutorial](https://www.deepspeed.ai/tutorials/zero-offload/).
* The source code for ZeRO-Offload can be found in the [DeepSpeed repo](https://github.com/microsoft/deepspeed).
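As a rough sketch of what enabling this looks like, a DeepSpeed JSON config along these lines turns on ZeRO stage 2 with CPU offload. See the ZeRO-Offload tutorial linked above for the authoritative option names; the batch size and fp16 settings here are illustrative.

```json
{
  "train_batch_size": 8,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2,
    "cpu_offload": true
  }
}
```

With `cpu_offload` enabled, optimizer states and gradient updates are kept in CPU memory, freeing GPU memory for model parameters and activations.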