FairScale is a PyTorch extension library for high performance and large scale training.
This library extends basic PyTorch capabilities while adding new state-of-the-art (SOTA) scaling techniques.
FairScale makes the latest distributed training techniques available in the form of composable
modules and easy-to-use APIs. These APIs are a fundamental part of a researcher's toolbox as
researchers attempt to scale models with limited resources.
FairScale was designed with the following values in mind:
* **Usability** - Users should be able to understand and use FairScale APIs with minimum cognitive overload.
* **Modularity** - Users should be able to seamlessly combine multiple FairScale APIs as part of their training loop.
* **Performance** - FairScale APIs provide the best performance in terms of scaling and efficiency.
## What's New:
* November 2021 [fairscale 0.4.3 was released](https://github.com/facebookresearch/fairscale/releases/tag/v0.4.3).
  * We have an experimental layer that fuses multiple layers together to support training with large vocabulary sizes.
* November 2021 [fairscale 0.4.2 was released](https://github.com/facebookresearch/fairscale/releases/tag/v0.4.2).
  * We have a new experimental API called the LayerwiseMemoryTracker to help track, visualize, and suggest fixes for memory issues occurring during the forward/backward pass of your models.
  * Introducing the SlowMoDistributedDataParallel API, a distributed training wrapper that is useful on clusters with slow network interconnects (e.g. Ethernet); see the sketch after this list.
* September 2021 [`master` branch renamed to `main`](https://github.com/github/renaming).
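The exact constructor arguments for SlowMoDistributedDataParallel can vary by release, so treat the following as a hedged sketch rather than a reference; the `perform_slowmo` call after each optimizer step is the part specific to this wrapper, and the momentum and processes-per-node values are illustrative:

```python
import torch
import torch.distributed as dist
from fairscale.experimental.nn.data_parallel import SlowMoDistributedDataParallel as SlowMoDDP

def train(rank: int, world_size: int, epochs: int):
    # Initialize the process group (NCCL backend for GPU training).
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    # Wrap the model with SlowMoDDP instead of vanilla DDP.
    # A toy model stands in for a real one here.
    model = torch.nn.Linear(10, 10).to(rank)
    model = SlowMoDDP(model, slowmo_momentum=0.5, nprocs_per_node=8)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(epochs):
        optimizer.zero_grad()
        out = model(torch.randn(20, 10).to(rank))
        out.sum().backward()
        optimizer.step()
        # SlowMo performs its slow-momentum synchronization here.
        model.perform_slowmo(optimizer)
```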
## Installation
To install FairScale, please see these [instructions](https://github.com/facebookresearch/fairscale/blob/main/docs/source/installation_instructions.rst). You should be able to install it as a pip package or
build it directly from source.
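For example, the release on PyPI can be installed with:

```
pip install fairscale
```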
## Getting Started
The full [documentation](https://fairscale.readthedocs.io/) contains instructions for getting started, deep dives and tutorials about the various FairScale APIs.
## Examples
Here are a few sample snippets from a subset of FairScale offerings:
### Pipe
Run a 4-layer model on 2 GPUs. The first two layers run on `cuda:0` and the next two layers run on `cuda:1`.
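A minimal sketch (the layer sizes are illustrative; the input lives on the first partition's device and the output is returned on the last):

```python
import torch
import fairscale

# A toy 4-layer model; any nn.Sequential works.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 10),
    torch.nn.ReLU(),
)

# balance=[2, 2] assigns the first two layers to cuda:0 and the next
# two to cuda:1; chunks splits each mini-batch into micro-batches.
model = fairscale.nn.Pipe(model, balance=[2, 2], chunks=8)

output = model(torch.randn(32, 10).to("cuda:0"))
```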
### Optimizer state sharding (ZeRO)
See a more complete example [here](https://github.com/facebookresearch/fairscale/blob/main/benchmarks/oss.py), but a minimal example could look like the following:
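A hedged sketch, using a toy model and dataset in place of real ones:

```python
import torch
import torch.distributed as dist
from fairscale.optim.oss import OSS

def train(rank: int, world_size: int, epochs: int):
    # Initialize the distributed process group (assumes env vars such
    # as MASTER_ADDR/MASTER_PORT are set, e.g. by the launcher).
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    # A toy model and dataset; replace with your own.
    model = torch.nn.Linear(10, 10).to(rank)
    dataloader = [(torch.randn(8, 10), torch.randn(8, 10)) for _ in range(4)]
    loss_fn = torch.nn.MSELoss()

    # Wrap any PyTorch-compliant optimizer with OSS so its state is
    # sharded across ranks; extra kwargs go to the base optimizer.
    optimizer = OSS(params=model.parameters(), optim=torch.optim.SGD, lr=1e-3)

    # The training loop itself is unchanged by OSS.
    model.train()
    for _ in range(epochs):
        for inputs, target in dataloader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs.to(rank)), target.to(rank))
            loss.backward()
            optimizer.step()
```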
## Testing
We use CircleCI to test FairScale with the following PyTorch versions (with CUDA 11.2):
* the latest stable release (1.10)
* the latest LTS release (1.8)
* a recent nightly release (1.11.0.dev20211101+cu111)
Please create an [issue](https://github.com/facebookresearch/fairscale/issues) if you are having trouble with installation.
## Contributors
We welcome outside contributions! Please see the [CONTRIBUTING](CONTRIBUTING.md) instructions for how you can contribute to FairScale.
## License
FairScale is licensed under the [BSD-3-Clause License](LICENSE).
fairscale.nn.pipe is forked from [torchgpipe](https://github.com/kakaobrain/torchgpipe), Copyright 2019, Kakao Brain, licensed under [Apache License](http://www.apache.org/licenses/LICENSE-2.0).
fairscale.nn.model_parallel is forked from [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), Copyright 2020, NVIDIA CORPORATION, licensed under [Apache License](http://www.apache.org/licenses/LICENSE-2.0).
fairscale.optim.adascale is forked from [AdaptDL](https://github.com/petuum/adaptdl), Copyright 2020, Petuum, Inc., licensed under [Apache License](http://www.apache.org/licenses/LICENSE-2.0).
fairscale.nn.misc.flatten_params_wrapper is forked from [PyTorch-Reparam-Module](https://github.com/SsnL/PyTorch-Reparam-Module), Copyright 2018, Tongzhou Wang, licensed under [MIT License](https://github.com/SsnL/PyTorch-Reparam-Module/blob/master/LICENSE).
## Citing FairScale
If you use FairScale in your publication, please cite it by using the following BibTeX entry.
```BibTeX
@Misc{FairScale2021,
  author =       {Mandeep Baines and Shruti Bhosale and Vittorio Caggiano and Naman Goyal and Siddharth Goyal and Myle Ott and Benjamin Lefaudeux and Vitaliy Liptchinsky and Mike Rabbat and Sam Shleifer and Anjali Sridhar and Min Xu},
  title =        {FairScale: A general purpose modular PyTorch library for high performance and large scale training},
  howpublished = {\url{https://github.com/facebookresearch/fairscale}},
  year =         {2021}
}
```
## FAQ
1. If you experience an error indicating that a default branch does not exist, it is probably due to the recent update that switched the default branch from `master` to `main`:
```
error: pathspec 'non-existing-branch' did not match any file(s) known to git
```
Please run the following commands to update to the main branch.
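Assuming a standard clone, the commands from GitHub's renaming guide (linked above) are:

```
git branch -m master main
git fetch origin
git branch -u origin/main main
git remote set-head origin -a
```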