- 29 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* properly catch a given test failing when there are not enough GPUs
-
- 28 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* file-based dist init
* nicer handling of world sizes that do not match the number of available GPUs: do not break, but warn
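The "warn instead of break" behaviour described above could look roughly like the sketch below. This is not the code from the commit: the function name `resolve_world_size` and the choice to clamp the world size down to the GPU count are assumptions made for illustration only.

```python
import warnings


def resolve_world_size(requested: int, available_gpus: int) -> int:
    """Hypothetical helper: return a usable world size.

    Assumption (not from the commit): when more ranks are requested than
    GPUs are available, warn and clamp instead of raising an error.
    """
    if requested > available_gpus:
        warnings.warn(
            f"Requested world size {requested} exceeds the "
            f"{available_gpus} available GPU(s); using {available_gpus}."
        )
        return available_gpus
    return requested
```

A caller would pass the result on to its process-spawning code; a request that already fits is returned unchanged and no warning is emitted.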
-
- 19 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
[OSS] Getting rid of the "should bucket" hash table, just use a list + non-trainable params fix (#259)
* Getting rid of the "should bucket" hash table, just use a list. Properly handle all params, with or without requires_grad
* make sure that this case is unit tested
-
- 10 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* unit test checking ddp and sharded_ddp equivalence, reproducing the issue that Sean spotted
* fixing the issue: requests in flight were not counted properly
* adding a multiple optimizers case
-
- 04 Dec, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* proper unit testing, but no solution other than disabling bucketing for now; the couple of options tested do not work
-
- 01 Dec, 2020 2 commits
-
-
Benjamin Lefaudeux authored
-
Benjamin Lefaudeux authored
* fallback on internal pytorch numbering
-
- 21 Nov, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* rewrite using autograd and Variable execution queue to make the reduce automatic
* share buckets with OSS to remove duplication
* some speed still likely on the table since the speed vs. bucketing does not match expectations, could be a follow up
-
- 18 Nov, 2020 1 commit
-
-
Tom Birch authored
-
- 11 Nov, 2020 2 commits
- 10 Nov, 2020 1 commit
-
-
Tom Birch authored
Adds support for:
* Reused layers (e.g. for weight sharing)
* Lazily-constructed layers
* Single-process control via PipeRPCWrapper
* PipelineStyle.AsyncSchedule, which lays the foundation for asynchronous pipeline work by introducing an event loop for each rank/worker to process either activations or gradients as they arrive

Also added examples for multi-process and PipeRPCWrapper
-
- 30 Oct, 2020 1 commit
-
-
msbaines authored
-
- 29 Oct, 2020 1 commit
-
-
msbaines authored
-
- 23 Oct, 2020 1 commit
-
-
msbaines authored
-
- 21 Oct, 2020 1 commit
-
-
msbaines authored
-
- 20 Oct, 2020 1 commit
-
-
Min Xu authored
- fixed typing
- make it run less often to reduce CI time

testing: run it in a loop to make sure it runs at the right frequency.
-
- 17 Oct, 2020 1 commit
-
-
msbaines authored
-
- 16 Oct, 2020 2 commits
- 14 Oct, 2020 1 commit
-
-
msbaines authored
-
- 08 Oct, 2020 2 commits
-
-
msbaines authored
Currently only implemented for a single process and expert.
-
Min Xu authored
* Add unittest for checkpoint & DDP
- this change adds test cases to reproduce the error with checkpoint & DDP
- mandeep mentioned that there is also a deadlock in this case, but this change doesn't cover that.
- we cover cases where weight sharing is OK
- however, checkpointing the same module multiple times and find_unused_parameters are both not OK
* added norm checks
-
- 06 Oct, 2020 1 commit
-
-
Benjamin Lefaudeux authored
Same bucketing strategy for OSS and SDP: sort everything ahead of time, per rank and per size, smaller tensors first. Bucket the smallest elements in a fixed buffer, send async, then send all the others async, and get back to the bucket. Once done, scatter the contents if needed.
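The partitioning step of this strategy can be sketched in plain Python. This is a minimal illustration, not the fairscale implementation: `Param`, `plan_buckets`, and `bucket_capacity` are hypothetical names, sizes are plain element counts, and the asynchronous sends and the final scatter are left out.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Param(NamedTuple):
    """Hypothetical stand-in for a tensor: name, element count, owning rank."""
    name: str
    numel: int
    rank: int


def plan_buckets(
    params: List[Param], bucket_capacity: int
) -> Dict[int, Tuple[List[str], List[str]]]:
    """For each rank, split params into (bucketed, sent_individually)."""
    per_rank: Dict[int, List[Param]] = defaultdict(list)
    for p in params:
        per_rank[p.rank].append(p)

    plan: Dict[int, Tuple[List[str], List[str]]] = {}
    for rank, group in per_rank.items():
        bucketed: List[str] = []
        direct: List[str] = []
        used = 0
        # Smaller tensors first: they fill the fixed-size buffer.
        for p in sorted(group, key=lambda q: q.numel):
            if used + p.numel <= bucket_capacity:
                bucketed.append(p.name)
                used += p.numel
            else:
                # Too large for the remaining space: sent on its own.
                direct.append(p.name)
        plan[rank] = (bucketed, direct)
    return plan


params = [Param("a", 10, 0), Param("b", 2, 0), Param("c", 5, 0), Param("d", 3, 1)]
plan = plan_buckets(params, bucket_capacity=8)
```

Because each rank's list is sorted by size, the bucketed tensors always form a prefix of that list: once one tensor does not fit, no later one will.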
-
- 05 Oct, 2020 1 commit
-
-
msbaines authored
-
- 02 Oct, 2020 1 commit
-
-
msbaines authored
-
- 29 Sep, 2020 1 commit
-
-
Benjamin Lefaudeux authored
- adding the buffer broadcast option
- minor cleanup in shardedDDP
-
- 17 Sep, 2020 2 commits
-
-
Tom Birch authored
Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines)
* Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess, but also supporting PipelineStyle.MultiProcess
* Added support for lazy construction of modules (see lazy_construction for an example)
* Added two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
* Copied all the relevant tests from tests/pipe to tests/pipe_process and modified them to exercise PipelineStyle.MultiProcess
-
Benjamin Lefaudeux authored
- rename oss_ddp to ShardedDataParallel
- some refactoring
- ShardedDataParallel owns the sharded optimizer, exposed if need be
- some small perf bumps
-
- 28 Aug, 2020 1 commit
-
-
Min Xu authored
- added train(mode) method to be aware of eval mode
-
- 06 Aug, 2020 1 commit
-
-
Min Xu authored
Co-authored-by: Min Xu <m1n@fb.com>
-
- 31 Jul, 2020 1 commit
-
-
Tom Birch authored
-
- 08 Jul, 2020 1 commit
-
-
Mandeep Singh Baines authored
-