- 24 Feb, 2021 1 commit
anj-s authored
- 04 Feb, 2021 1 commit
msbaines authored
- 03 Feb, 2021 1 commit
anj-s authored
* mp cleanup
* round of multiprocess refactoring
* test golden run
* print cuda stats
* fix lint errors
* enable multiprocess pipe benchmarks
* set world size to be available gpus
* more changes
* use synthetic loaders for intermediate pipeline stages
* merged master
* fix for the devices property
* dataloader fix
* modify rank check
* print wps stats
* enable verification
* fix logging
* fix flag name
* fix flag name
* check for rank
* fix indent
* pass args
* pass args
* modify golden data
* remove unused print message
* fix lint errors
* add comments
* fix benchmarks

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
- 29 Jan, 2021 1 commit
msbaines authored
- 27 Jan, 2021 1 commit
msbaines authored
- 21 Jan, 2021 1 commit
anj-s authored
* [refactor] Remove unused variables and refactor common configurations
* move helper function to call site
* fixed lint errors
* fix lint errors
* fix lint errors
* fix lint errors
* fix import order
* format files
* remove unused imports
* fix lint errors
* fix lint errors
* refactor common utilities
* address PR comments
* sorted imports
* add space
* modify comment
* added doc strings and addressed PR comments
* addressed PR comments
* added another comment to clarify
* fixing lint errors
* addressed PR comments
* addressed PR comments
* fixed typos
* initialize var
* rename seq_pred to lm
* fix lint errors
* move datasets and models into separate folders
* add the folders created
* fix lint errors
* create golden config to stats mapping
* add common batching for both synthetic and real data
* fixed lint errors
* enable real pipe benchmarks with new golden data
* reduce seq len to avoid OOM
* updated golden data
* add logging
* add golden data
* add golden data
* fix lint errors
* add doc string
* remove unused class
* add seq len and batch size to the config
* remove commented out line
* address comments
* rename imports
* refactor common logic in dataloaders
* add golden configs
* lint changes
* merge latest changes
* lint errors
* address PR comments

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
- 19 Jan, 2021 1 commit
anj-s authored
* [refactor] Remove unused variables and refactor common configurations
* move helper function to call site
* fixed lint errors
* fix lint errors
* fix lint errors
* fix lint errors
* fix import order
* format files
* remove unused imports
* fix lint errors
* fix lint errors
* refactor common utilities
* address PR comments
* sorted imports
* add space
* modify comment
* added doc strings and addressed PR comments
* addressed PR comments
* added another comment to clarify
* fixing lint errors
* addressed PR comments
* addressed PR comments
* fixed typos
* initialize var
* rename seq_pred to lm
* fix lint errors
* move datasets and models into separate folders
* add the folders created
* fix lint errors
* create golden config to stats mapping
* add common batching for both synthetic and real data
* fixed lint errors
* enable real pipe benchmarks with new golden data
* reduce seq len to avoid OOM
* updated golden data
* add logging
* add golden data
* add golden data
* fix lint errors
* add doc string
* remove commented out line
* address comments
* rename imports
* refactor common logic in dataloaders
* add golden configs
* lint changes

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
- 04 Jan, 2021 1 commit
anj-s authored
* [refactor] Remove unused variables and refactor common configurations
* move helper function to call site
* fixed lint errors
* fix lint errors
* fix lint errors
* fix lint errors
* fix import order
* format files
* remove unused imports
* fix lint errors
* fix lint errors
* refactor common utilities
* address PR comments
* sorted imports
* add space
* modify comment
* added doc strings and addressed PR comments
* addressed PR comments
* added another comment to clarify
* fixing lint errors
* addressed PR comments
* addressed PR comments
* fixed typos
* initialize var
* rename seq_pred to lm
* fix lint errors

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
- 30 Dec, 2020 1 commit
anj-s authored
[refactor] Remove unused variables, add configuration objects, and do basic cleanup for pipe benchmarks. (#252)
* [refactor] Remove unused variables and refactor common configurations
* move helper function to call site
* fixed lint errors
* fix lint errors
* fix lint errors
* fix lint errors
* fix import order
* format files
* remove unused imports
* fix lint errors
* address PR comments
* sorted imports
* add space
* modify comment
* added doc strings and addressed PR comments
* addressed PR comments
* added another comment to clarify
* fixing lint errors
* rename variable

Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>
- 16 Dec, 2020 1 commit
jessijzhao authored
* [feat] add CPU support to tutorials in examples
  * now works on a machine without CUDA
  * fixes some minor typos
* [cleanup] factorize tutorials in examples
  * collects duplicate code across tutorials in helpers.py
* [fix] getData in tutorials now returns an iterable
- 01 Dec, 2020 1 commit
Benjamin Lefaudeux authored
- 19 Nov, 2020 2 commits
Benjamin Lefaudeux authored
* reverting a change that slipped in with #188
Yuanyuan (Ana) Shen authored
* Add CPU support for pipe.py benchmarks, CUDA-free
- 10 Nov, 2020 1 commit
Tom Birch authored
Adds support for:
* Reused layers (e.g. for weight sharing)
* Lazily-constructed layers
* Single-process control via PipeRPCWrapper
* PipelineStyle.AsyncSchedule, which lays the foundation for asynchronous pipeline work by introducing an event loop for each rank/worker to process either activations or gradients as they arrive

Also added examples for multi-process pipelines and PipeRPCWrapper.
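A minimal sketch of the lazy-construction idea: each stage holds zero-argument callables and only instantiates the layers assigned to its rank. The helpers here (layer_factories, materialize) are illustrative names, not fairscale API.

```python
# Minimal sketch of lazily-constructed pipeline layers (hypothetical helpers,
# not fairscale's API): a stage materializes only the slice of layers it owns.
from typing import Callable, List

import torch.nn as nn

layer_factories: List[Callable[[], nn.Module]] = [
    lambda: nn.Linear(1024, 4096),
    lambda: nn.ReLU(),
    lambda: nn.Linear(4096, 1024),
]

def materialize(factories: List[Callable[[], nn.Module]], start: int, stop: int) -> nn.Sequential:
    """Construct only the layers owned by this pipeline stage."""
    return nn.Sequential(*(f() for f in factories[start:stop]))

stage0 = materialize(layer_factories, 0, 2)  # e.g. rank 0 owns layers [0, 2)
stage1 = materialize(layer_factories, 2, 3)  # rank 1 owns layer [2, 3)
```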
- 28 Oct, 2020 1 commit
msbaines authored
- 17 Sep, 2020 1 commit
Tom Birch authored
Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines):
* Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess but also supporting PipelineStyle.MultiProcess
* Added support for lazy construction of modules (see lazy_construction for an example)
* Added two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
* Copied all the relevant tests from tests/pipe to tests/pipe_process and modified them to exercise PipelineStyle.MultiProcess
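Construction would look roughly like this sketch, based only on the commit description above; PipelineStyle's import path is an assumption for this revision, and the distributed/RPC setup needed by the multi-process style is omitted.

```python
# Sketch of the style argument described in this commit (import path of the
# PipelineStyle enum is an assumption; distributed/RPC init is omitted).
import torch.nn as nn
from fairscale.nn import Pipe
from fairscale.nn.pipe import PipelineStyle  # assumed location of the enum

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

# Default: every partition lives in the current process.
pipe = Pipe(model, balance=[2, 1], style=PipelineStyle.SingleProcess)

# With torch.distributed and RPC initialized, each partition instead runs in
# its own process (potentially on another machine):
#   pipe = Pipe(model, balance=[2, 1], style=PipelineStyle.MultiProcess)
```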
- 03 Sep, 2020 1 commit
Jun Ru Anderson authored
Add GradScaler to Fairscale, subclassing PyTorch's GradScaler. Use GradScaler in the pipe benchmark; though it is not needed in this case, it is a good example of how to use gradient scaling for larger models that do require it in order to converge.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
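A minimal gradient-scaling loop with PyTorch's torch.cuda.amp.GradScaler, the class this commit subclasses. The model and data are placeholders, and a CUDA device is assumed.

```python
# Standard gradient-scaling loop (placeholder model/data; CUDA assumed).
import torch

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    optimizer.zero_grad()
    inputs = torch.randn(4, 10, device="cuda")
    with torch.cuda.amp.autocast():
        loss = model(inputs).sum()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 grad underflow
    scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
    scaler.update()                # adjusts the scale for the next iteration
```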
- 28 Aug, 2020 1 commit
Jun Ru Anderson authored
* specify chunks for pipe/transformer benchmark: set chunks equal to len(balance) for the pipe/transformer benchmark. Words-per-second and memory-usage checks will be updated in the next commit (must test on CircleCI to find appropriate values)
* change benchmark words per second and memory usage: did six runs for words-per-second, with results 9144.40, 9163.91, 9993.01, 9082.82, 9155.09, 9000.67. Peak allocated bytes per device (which do not change between runs) were 193206272, 645632, 562688, 92688384 for devices 0, 1, 2 and 3, respectively
* increase batch size: the batch size was small enough that the GPUs' computing power was not the bottleneck, which slowed training and, in particular, made more chunks slower. Increasing the batch size therefore increased training speed
* update benchmark numbers: ran six times, with wps of 36917.44, 36797.65, 37006.03, 36872.84, 37129.31, 37003.31 and peak allocated bytes of 4061909504, 4050944, 10427392, 2031824896 for devices 0, 1, 2 and 3, respectively

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
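As a sketch, the chunks setting amounts to the following, assuming a Pipe constructor that takes balance and chunks arguments (layer sizes here are placeholders):

```python
# Sketch: one microbatch (chunk) per pipeline stage, as this commit does.
import torch.nn as nn
from fairscale.nn import Pipe

layers = nn.Sequential(*[nn.Linear(128, 128) for _ in range(8)])
balance = [2, 2, 2, 2]  # number of layers assigned to each device/stage
model = Pipe(layers, balance=balance, chunks=len(balance))
```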
- 22 Aug, 2020 1 commit
Jun Ru Anderson authored
Implement scaling of the optimizer state when using pure-fp16 training, to avoid underflow. Update the benchmark to use pure fp16. Modify the state_dict methods to store and load the optimizer state scale.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
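An illustrative sketch of the underflow-avoidance idea, not fairscale's actual implementation: fp16 optimizer state is kept pre-multiplied by a scale factor, and that scale round-trips through the state dict.

```python
# Illustrative only (not fairscale's code): fp16 optimizer state is stored
# pre-multiplied by optim_scale so small values stay representable in fp16.
import torch

class ScaledMomentum:
    def __init__(self, size: int, scale: float = 2.0 ** 10):
        self.optim_scale = scale
        self.exp_avg = torch.zeros(size, dtype=torch.float16)  # scaled state

    def update(self, grad: torch.Tensor, beta: float = 0.9) -> None:
        scaled_grad = grad.half() * self.optim_scale
        self.exp_avg.mul_(beta).add_(scaled_grad, alpha=1 - beta)

    def value(self) -> torch.Tensor:
        return self.exp_avg.float() / self.optim_scale  # unscale for use

    def state_dict(self) -> dict:
        # The scale must be saved with the state, as this commit does.
        return {"exp_avg": self.exp_avg, "optim_scale": self.optim_scale}

    def load_state_dict(self, state: dict) -> None:
        self.exp_avg = state["exp_avg"]
        self.optim_scale = state["optim_scale"]
```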
- 21 Aug, 2020 1 commit
Jun Ru Anderson authored
Set the torch seed for tests. xfail the mixed-precision and memory-efficient mixed-precision state_dict tests, since their states are cast to FP16 and back to FP32 during load_state_dict, losing precision.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
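The pattern in miniature, using standard torch and pytest APIs (the test body here is a placeholder, not one of the repo's tests):

```python
# Seed for determinism, and mark a known fp16 round-trip failure as expected.
import pytest
import torch

torch.manual_seed(0)  # deterministic tensors across test runs

@pytest.mark.xfail(reason="state is cast fp32 -> fp16 -> fp32 by load_state_dict, losing precision")
def test_state_dict_roundtrip():
    x = torch.randn(4, 4)
    # Fails for values not exactly representable in fp16, hence the xfail.
    assert torch.equal(x.half().float(), x)
```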
- 18 Aug, 2020 1 commit
Jun Ru Anderson authored
Allow training with the optimizer state in fp16. Use an enum to select from full precision, mixed precision, memory-efficient mixed precision, and pure fp16. Improve the clarity of the testing code.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
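A hypothetical sketch of such an enum; the actual member names in fairscale may differ, and the dtype annotations follow the descriptions in this and the neighboring commits.

```python
# Hypothetical precision-mode enum (names are guesses from the commit text).
from enum import Enum, auto

class Precision(Enum):
    FULL_PRECISION = auto()                    # fp32 params and grads
    MIXED_PRECISION = auto()                   # fp16 params, fp32 grads
    MEMORY_EFFICIENT_MIXED_PRECISION = auto()  # fp16 params and fp16 grads
    PURE_FP16 = auto()                         # fp16 params, grads, and optimizer state
```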
- 14 Aug, 2020 1 commit
Jun Ru Anderson authored
Add support for mixed-precision (half-precision params, full-precision gradients) and memory-efficient (half-precision params and half-precision gradients) training with Adam.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
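A generic master-weights sketch of the "half-precision params, full-precision gradients" scheme; this is illustrative (CUDA assumed), not fairscale's exact implementation.

```python
# Classic master-weights pattern: the model holds fp16 params, the optimizer
# steps on fp32 copies, and updated weights are written back as fp16.
import torch

model = torch.nn.Linear(64, 64).cuda().half()  # fp16 parameters
master = [p.detach().float().requires_grad_() for p in model.parameters()]
optimizer = torch.optim.Adam(master, lr=1e-3)

def step() -> None:
    """Call after loss.backward() on the fp16 model."""
    for m, p in zip(master, model.parameters()):
        m.grad = p.grad.detach().float()  # full-precision gradients
    optimizer.step()
    with torch.no_grad():
        for m, p in zip(master, model.parameters()):
            p.copy_(m.half())             # write weights back as fp16

out = model(torch.randn(8, 64, device="cuda", dtype=torch.float16))
out.sum().backward()
step()
```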
- 31 Jul, 2020 3 commits
Jun Ru Anderson authored
Add FusedAdam, update the benchmark, and add tests.

Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
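Usage would look roughly like the following; the import path is an assumption for this revision, with FusedAdam taken to be a drop-in replacement for torch.optim.Adam backed by fused CUDA kernels.

```python
# Hedged sketch: assumes FusedAdam is exposed at fairscale.optim.
import torch
from fairscale.optim import FusedAdam  # assumption: export location

model = torch.nn.Linear(128, 128).cuda()
optimizer = FusedAdam(model.parameters(), lr=1e-3)  # drop-in for torch.optim.Adam

loss = model(torch.randn(32, 128, device="cuda")).sum()
loss.backward()
optimizer.step()
```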
Jun Ru Anderson authored
Jun Ru Anderson authored
- 08 Jul, 2020 1 commit
Mandeep Singh Baines authored