Commits · 7ee228bf68fe8624da596860949ceb5ebc1b1dfe · OpenDAS / fairscale

24 Feb, 2021 2 commits

[refactor] Modify folder locations for tests/ to mirror source code tree. (#419) · 3b0717eb

anj-s authored Feb 24, 2021



* refactor experimental file locations

* refactor fix

* disable test temporarily

* lint error fix

* make the change in the right file

* fix lint errors

* skip failing tests
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


split benchmark configs (#420) · b89365e6
anj-s authored Feb 23, 2021


23 Feb, 2021 1 commit

[refactor] Move experimental folder to the fairscale repo (#410) · 045a9743

anj-s authored Feb 22, 2021



* move experimental to the fairscale repo

* lint error fixes

* modify test imports

* lint error fixes

* lint errors
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


04 Feb, 2021 1 commit
- [refactor] multiprocess_pipe: remove pipelined_backward (#362) · 42e44149
  msbaines authored Feb 04, 2021
  
03 Feb, 2021 2 commits

[feat][minor] OSS Benchmark - regression test + background testing new optims (#352) · de713d1e
Benjamin Lefaudeux authored Feb 03, 2021
```
* restoring the regression test, adding a test of the for_each optims
* fix the regression test on circleci
* removing unused flags
```

[refactor] Refactor and enable multiprocess nn.Pipe benchmarks. (#319) · cd186441

anj-s authored Feb 03, 2021



* mp cleanup

* round of multiprocess refactoring

* test golden run

* print cuda stats

* fix lint errors

* enable multiprocess pipe benchmarks

* set world size to be available gpus

* more changes

* use synthetic loaders for intermediate pipeline stages

* merged master

* fix for the devices property

* dataloader fix

* modify rank check

* print wps stats

* enable verification

* fix logging

* fix flag name

* fix flag name

* check for rank

* fix indent

* pass args

* pass args

* modify golden data

* remove unused print message

* fix lint errors

* add comments

* fix benchmarks
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


29 Jan, 2021 1 commit
- [refactor] make AsyncPipe its own class (#341) · eaee5976
  msbaines authored Jan 29, 2021
  
27 Jan, 2021 1 commit
- [refactor] pipe: separate out Single and MultiProcess pipe (#326) · cae9b638
  msbaines authored Jan 26, 2021
  
25 Jan, 2021 1 commit

[refactor] Add benchmark config object and validation function (#314) · 331aed2c

anj-s authored Jan 25, 2021



* [refactor]Remove unused variables and refactor common configurations

* move helper function to call site

* fixed lint errors

* fix lint errors

* fix lint errors

* fix lint errors

* fix import order

* format files

* remove unused imports

* fix lint errors

* fix lint errors

* refactor common utilities

* address PR comments

* sorted imports

* add space

* modify comment

* added doc strings and addressed PR comments.

* addressed PR comments

* added another comment to clarify.

* fixing lint errors

* addressed PR comments

* addressed PR comments

* fixed typos

* initialize var

* rename seq_pred to lm

* fix lint errors

* move datasets and models into separate folders

* add the folders created

* fix lint errors

* create golden config to stats mapping

* add common batching for both synthetic and real data

* fixed lint errors

* enable real pipe benchmarks with new golden data

* reduce seq len to avoid OOM

* updated golden data

* add logging

* add golden data

* add golden data

* fix lint errors

* add doc string

* remove unused class

* add seq len and batch size to the config

* remove commented out line

* address comments

* rename imports

* refactor common logic in dataloaders

* add golden configs

* lint changes

* merge latest changes

* lint errors

* address PR comments

* initial refactoring

* lint fixes

* fix lint errors

* update comment
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


23 Jan, 2021 1 commit

[feat] Add AMPnet implementation in experimental dir (#304) · 14491030

Siddharth Goyal authored Jan 22, 2021

* Add AMPnet implementation (clean version)

* Move ampnet to experimental

* Move stuff around pipeline

* Address review comments and fix pre-commit errors

* Refactor and modify delegate functionality

* Modify header in pipe.py


21 Jan, 2021 2 commits

[feat] Enabling ViT in OSS benchmarks (#322) · 8a49a748
Benjamin Lefaudeux authored Jan 21, 2021


[refactor] Add batch size to the golden benchmark configs. (#313) · 81841734

anj-s authored Jan 21, 2021



* [refactor]Remove unused variables and refactor common configurations

* move helper function to call site

* fixed lint errors

* fix lint errors

* fix lint errors

* fix lint errors

* fix import order

* format files

* remove unused imports

* fix lint errors

* fix lint errors

* refactor common utilities

* address PR comments

* sorted imports

* add space

* modify comment

* added doc strings and addressed PR comments.

* addressed PR comments

* added another comment to clarify.

* fixing lint errors

* addressed PR comments

* addressed PR comments

* fixed typos

* initialize var

* rename seq_pred to lm

* fix lint errors

* move datasets and models into separate folders

* add the folders created

* fix lint errors

* create golden config to stats mapping

* add common batching for both synthetic and real data

* fixed lint errors

* enable real pipe benchmarks with new golden data

* reduce seq len to avoid OOM

* updated golden data

* add logging

* add golden data

* add golden data

* fix lint errors

* add doc string

* remove unused class

* add seq len and batch size to the config

* remove commented out line

* address comments

* rename imports

* refactor common logic in dataloaders

* add golden configs

* lint changes

* merge latest changes

* lint errors

* address PR comments
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


19 Jan, 2021 1 commit

[refactor] Enable benchmarks/pipe.py and merge real and synthetic input pipeline. (#286) · 44b9bcd8

anj-s authored Jan 19, 2021



* [refactor]Remove unused variables and refactor common configurations

* move helper function to call site

* fixed lint errors

* fix lint errors

* fix lint errors

* fix lint errors

* fix import order

* format files

* remove unused imports

* fix lint errors

* fix lint errors

* refactor common utilities

* address PR comments

* sorted imports

* add space

* modify comment

* added doc strings and addressed PR comments.

* addressed PR comments

* added another comment to clarify.

* fixing lint errors

* addressed PR comments

* addressed PR comments

* fixed typos

* initialize var

* rename seq_pred to lm

* fix lint errors

* move datasets and models into separate folders

* add the folders created

* fix lint errors

* create golden config to stats mapping

* add common batching for both synthetic and real data

* fixed lint errors

* enable real pipe benchmarks with new golden data

* reduce seq len to avoid OOM

* updated golden data

* add logging

* add golden data

* add golden data

* fix lint errors

* add doc string

* remove commented out line

* address comments

* rename imports

* refactor common logic in dataloaders

* add golden configs

* lint changes
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


04 Jan, 2021 1 commit

[refactor] Modify train and benchmark functions to account for multiple models and datasets. (#260) · 656fc319

anj-s authored Jan 04, 2021



* [refactor]Remove unused variables and refactor common configurations

* move helper function to call site

* fixed lint errors

* fix lint errors

* fix lint errors

* fix lint errors

* fix import order

* format files

* remove unused imports

* fix lint errors

* fix lint errors

* refactor common utilities

* address PR comments

* sorted imports

* add space

* modify comment

* added doc strings and addressed PR comments.

* addressed PR comments

* added another comment to clarify.

* fixing lint errors

* addressed PR comments

* addressed PR comments

* fixed typos

* initialize var

* rename seq_pred to lm

* fix lint errors
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


30 Dec, 2020 2 commits

[refactor] Remove unused variables, add configuration objects and basic cleanup for pipe benchmarks. (#252) · 3c727ec5

anj-s authored Dec 29, 2020

* [refactor]Remove unused variables and refactor common configurations

* move helper function to call site

* fixed lint errors

* fix lint errors

* fix lint errors

* fix lint errors

* fix import order

* format files

* remove unused imports

* fix lint errors

* address PR comments

* sorted imports

* add space

* modify comment

* added doc strings and addressed PR comments.

* addressed PR comments

* added another comment to clarify.

* fixing lint errors

* rename variable
Co-authored-by: Anjali Sridhar <anj@devfair0443.h2.fair>


[fix] Dead code removal for OSS (#276) · fb8d9137

Benjamin Lefaudeux authored Dec 29, 2020

* removing a dead call since ShardedDDP, small speedup
* unrelated, but filling in the changelog
* another nit


16 Dec, 2020 1 commit

[feat] add CPU support to tutorials in examples + factorize tutorials (#247) · 02478eb3

jessijzhao authored Dec 15, 2020

* [feat] add CPU support to tutorials in examples

* now works on a machine without cuda
* fixes some minor typos

* [cleanup] factorize tutorials in examples

* collects duplicate code across tutorials in helpers.py

* [fix] getData in tutorials now returns iterable


01 Dec, 2020 1 commit
- [chore] Refactor unit testing, shared utils (#218) · e83da060
  Benjamin Lefaudeux authored Dec 01, 2020
  
22 Nov, 2020 1 commit

[fix] More robust stats for regression testing (#204) · 2b121242

Benjamin Lefaudeux authored Nov 22, 2020

* testing median and MAD

* synchronize on kernels to make sure that we're measuring the actual completion time

* adjusting the circleci threshold, not that the speed has regressed but because we measure proper cuda execution time
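
The median/MAD idea mentioned above can be sketched as follows. This is a minimal illustration only, not the code from this commit: the function names, the sample timings, and the `k` threshold are all assumptions made for the example.

```python
import statistics


def median_and_mad(samples):
    """Return (median, median absolute deviation) of a list of timings."""
    values = list(samples)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, mad


def within_tolerance(measured, reference, mad, k=3.0):
    """Hypothetical check: accept if `measured` is within k MADs of the reference."""
    return abs(measured - reference) <= k * mad


# Made-up timings: one slow outlier (50.0) barely moves the median or the MAD,
# whereas it would shift a mean/standard-deviation estimate considerably.
timings = [10.0, 10.2, 9.9, 10.1, 50.0]
med, mad = median_and_mad(timings)
```

On a GPU, the timings fed to such a function would need a device synchronization (for example `torch.cuda.synchronize()`) before the clock is read, which is the point the second bullet makes.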


21 Nov, 2020 1 commit

[feat] ShardedDataParallel with autoreduce (#157) · ad933b34

Benjamin Lefaudeux authored Nov 21, 2020

* rewrite using autograd and Variable execution queue to make the reduce automatic
* share buckets with OSS to remove duplication
* some speed still likely on the table since the speed vs. bucketing does not match expectations, could be a follow up


19 Nov, 2020 2 commits
- [fix] Reverting a change which slipped in #188 (#198) · ba367d39
  Benjamin Lefaudeux authored Nov 18, 2020
```
* reverting a change which slipped in #188
```
- [feat] Add CPU support for pipe.py benchmarks (#188) · a842a927
  Yuanyuan (Ana) Shen authored Nov 18, 2020
```
* Add CPU support for pipe.py benchmarks, CUDA-free
```
18 Nov, 2020 1 commit

[feat] ShardedOptim: Distributed Grad Scaler (for torch AMP) (#182) · d85acf72

Benjamin Lefaudeux authored Nov 17, 2020

* adding a shard-aware GradScaler wrap, credits to Sean Naren for the idea
* adding stubs & explanations in the documentation


16 Nov, 2020 1 commit
- [feat] OSS-aware clip grads, bridge sharded states (#167) · ade312c4
  Benjamin Lefaudeux authored Nov 16, 2020
```
add a clip gradients util, equivalent to torch's but aware of the sharded states. Add a corresponding unit test
```
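
Why clipping has to be aware of sharding can be sketched in plain Python. This is an illustration under stated assumptions, not the utility added in this commit: each inner list stands in for the gradients held by one rank, the sum over ranks stands in for an all-reduce, and the function names are hypothetical.

```python
import math


def global_grad_norm(shard_grads):
    """L2 norm over all shards; the outer sum stands in for an all-reduce(SUM)."""
    local_sq = [sum(g * g for g in grads) for grads in shard_grads]
    return math.sqrt(sum(local_sq))


def clip_sharded(shard_grads, max_norm, eps=1e-6):
    """Scale every shard by the same coefficient, computed from the global norm."""
    total = global_grad_norm(shard_grads)
    coef = min(1.0, max_norm / (total + eps))
    clipped = [[g * coef for g in grads] for grads in shard_grads]
    return clipped, total


# Two "ranks", each holding one gradient value. Clipping each shard against its
# own local norm (3.0 and 4.0) would use different coefficients per rank.
clipped, total = clip_sharded([[3.0], [4.0]], max_norm=1.0)
```

Every rank applies the same coefficient, so the result matches what clipping the unsharded gradients would give.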
12 Nov, 2020 1 commit
- [fix] Pure cpu support for benchmarks/oss.py (#185) · 2fe93203
  Yuanyuan (Ana) Shen authored Nov 12, 2020
```
* now works on a machine without cuda, easier to debug and quick test
```
10 Nov, 2020 1 commit

Single-process control via PipeRPCWrapper (#156) · 5d4f50fb

Tom Birch authored Nov 10, 2020

Adds support for:
* Reused layers (e.g. for weight sharing)
* Lazily-constructed layers
* Single-process control via PipeRPCWrapper
* PipelineStyle.AsyncSchedule, which lays the foundation for asynchronous pipeline work by introducing an event loop for each rank/worker to process either activations or gradients as they arrive

Also added examples for multi-process and PipeRPCWrapper


06 Nov, 2020 1 commit
- [feature] Add a torch AMP benchmark option and test job (#175) · cc766aa5
  Benjamin Lefaudeux authored Nov 05, 2020
```
* oss benchmark: add an --amp option
* add a circleCI test
```
28 Oct, 2020 1 commit
- [chore] update isort to 5.6.4 (#170) · ea9876e3
  msbaines authored Oct 27, 2020
  
23 Oct, 2020 1 commit
- [feat][minor] OSS Benchmark - add a debug option to add some tensor dumps (#166) · 34f35fba
  Benjamin Lefaudeux authored Oct 23, 2020
```
* Some ease of use in the benchmark tool, add a debug option
```
21 Oct, 2020 1 commit

[feature] OSS: Use MNIST to benchmark (#159) · 6f8a8652

Benjamin Lefaudeux authored Oct 21, 2020

* switching to MNIST
* updating the reference values, should be good to go
* download dataset once for all processes


20 Oct, 2020 1 commit
- [feat][minor] OSS benchmark - pick the model via args (#152) · 49a3d9bc
  Benjamin Lefaudeux authored Oct 20, 2020
```
* Minor, ease of life to debug and makes it possible to test a host of models with the same code
```
17 Oct, 2020 1 commit
- [feat][minor] OSS: benchmark - adding a cpu option (#144) · 10062e58
  Benjamin Lefaudeux authored Oct 16, 2020
```
* adding a cpu option
* adjust the reference loss
```
14 Oct, 2020 1 commit
- [feat] OSS: adding a --profile option to the benchmark (#135) · 34915bf8
  Benjamin Lefaudeux authored Oct 14, 2020
  
10 Oct, 2020 1 commit
- [bugfix] OSS no reduce loss (#133) · 177151e0
  Benjamin Lefaudeux authored Oct 09, 2020
```
* bugfix
* adjust default non-regression loss, not all_reduced now
```
09 Oct, 2020 1 commit
- [minor] OSS: bring DDP in the benchmark (#130) · bfd88cad
  Benjamin Lefaudeux authored Oct 08, 2020
```
More realistic benchmarks, comparing apples to apples. DDP/OSS+DDP/OSS+SDP
```
06 Oct, 2020 1 commit

[feat] OSS/SDP : bucketing (#122) · 341d8b2b

Benjamin Lefaudeux authored Oct 05, 2020

Same bucketing strategy for OSS and SDP:
Sort everything ahead of time, per rank and per size, smallest tensors first. Bucket the smallest elements in a fixed buffer and send it asynchronously, then send all the others asynchronously, and return to the bucket. Once done, scatter the contents if needed.
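
The size-sorted split described above can be sketched for a single rank as follows. This is a simplified illustration rather than the OSS/SDP implementation: the function name, the use of element counts in place of real tensors, and the capacity value are assumptions, and the asynchronous sends are left out.

```python
def split_for_bucketing(sizes, bucket_capacity):
    """Return (bucketed, direct): indices packed into the fixed buffer, and
    indices of tensors that would be sent on their own."""
    # Visit tensors from smallest to largest.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    bucketed, direct = [], []
    used = 0
    for i in order:
        # Sizes are ascending, so once one tensor does not fit, no later one will.
        if not direct and used + sizes[i] <= bucket_capacity:
            bucketed.append(i)
            used += sizes[i]
        else:
            direct.append(i)
    return bucketed, direct


# Element counts of five hypothetical tensors and a 64-element buffer.
bucketed, direct = split_for_bucketing([1000, 4, 16, 300, 8], bucket_capacity=64)
```

Here the three smallest tensors (indices 1, 4, 2) share the buffer, while the two large ones (indices 3, 0) are sent individually.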


29 Sep, 2020 1 commit
- [ShardedDDP] Sync buffers + small cleanup (#112) · 79ded821
  Benjamin Lefaudeux authored Sep 28, 2020
```
- adding the buffer broadcast option
- minor cleanup in shardedDDP
```
24 Sep, 2020 1 commit

[fix] OSS benchmark cleanup (#109) · 53553474

Benjamin Lefaudeux authored Sep 24, 2020

- small benchmark refactor, only one for all backends and ddp
- deterministic, enforce alignment with pytorch ddp


22 Sep, 2020 2 commits
- [bug] Make OSS Gloo-compliant (#102) · b488dcfa
  Benjamin Lefaudeux authored Sep 22, 2020
```
* Broadcasting grad-enabled tensors is forbidden in Gloo, because this is not differentiable. Workaround
```
- [chore] OSS doc (#101) · d80c38f9
  Benjamin Lefaudeux authored Sep 22, 2020
```
* Doc extensions to some APIs
* Fix the benchmark and tutorial
```