- 24 Nov, 2020 1 commit
-
-
Stas Bekman authored
* make the basic example usable out of the box * clarify
-
- 22 Nov, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* testing median and MAD
* synchronize on kernels to make sure that we're measuring the actual completion time
* adjusting the CircleCI threshold: the speed has not regressed, but we now measure the actual CUDA execution time
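As a rough illustration of the statistics named above, here is a minimal sketch of computing the median and the median absolute deviation (MAD) over a set of timings. The function name and the sample values are hypothetical and are not taken from the benchmark code; on a GPU, the timings would only be meaningful if `torch.cuda.synchronize()` were called before reading the clock, which this CPU-only sketch omits.

```python
import statistics


def median_and_mad(samples):
    """Return (median, median absolute deviation) for a list of timings."""
    med = statistics.median(samples)
    # MAD: the median of the absolute deviations from the median.
    mad = statistics.median([abs(x - med) for x in samples])
    return med, mad


# Hypothetical per-step timings in milliseconds, with one slow outlier.
timings_ms = [10.1, 10.3, 9.9, 10.0, 42.0]
med, mad = median_and_mad(timings_ms)
print(f"median={med:.2f} ms, MAD={mad:.2f} ms")
```

Unlike the mean and standard deviation, both values are barely affected by the single outlier, which is presumably why they suit a noisy CI benchmark.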
-
- 21 Nov, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* rewrite using autograd and the Variable execution queue to make the reduce automatic
* share buckets with OSS to remove duplication
* some speed is likely still on the table, since the speed vs. bucketing does not match expectations; could be a follow-up
-
- 20 Nov, 2020 1 commit
-
-
msbaines authored
-
- 19 Nov, 2020 4 commits
-
-
msbaines authored
Fixes #190
-
msbaines authored
-
Benjamin Lefaudeux authored
* reverting a change which slipped in #188
-
Yuanyuan (Ana) Shen authored
* Add CPU support for pipe.py benchmarks, CUDA-free
-
- 18 Nov, 2020 2 commits
-
-
Tom Birch authored
-
Benjamin Lefaudeux authored
* adding a shard-aware GradScaler wrap, credits to Sean Naren for the idea
* adding stubs & explanations in the documentation
-
- 17 Nov, 2020 1 commit
-
-
Min Xu authored
- removed the experimental warning, as we have validated it on CIFAR and ImageNet; transformer is looking good so far too - fixed API doc formatting - made it consistent with the other code in the repo - tested by building the doc locally and inspecting the results
-
- 16 Nov, 2020 1 commit
-
-
Benjamin Lefaudeux authored
Add a gradient-clipping util, equivalent to torch's but aware of the sharded states. Add a corresponding unit test.
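To illustrate why clipping has to be aware of sharding, here is a minimal pure-Python sketch, assuming an L2 norm and two hypothetical ranks that each hold the gradients for their own parameter shard. It is not the util added in this commit: the helper names are made up, and the sum over shards stands in for what would be a collective reduction across processes.

```python
import math


def global_grad_norm(shard_sq_norms):
    """Combine per-shard squared L2 norms into the norm of the full gradient."""
    # Assumption: in a real distributed run this sum would be an all-reduce.
    return math.sqrt(sum(shard_sq_norms))


def clip_coefficient(total_norm, max_norm, eps=1e-6):
    """Scale factor in the style of torch's clip_grad_norm_, capped at 1."""
    return min(1.0, max_norm / (total_norm + eps))


# Hypothetical gradients: each inner list lives on a different rank.
shard_grads = [[3.0, 0.0], [0.0, 4.0]]
shard_sq_norms = [sum(g * g for g in grads) for grads in shard_grads]

total_norm = global_grad_norm(shard_sq_norms)
coef = clip_coefficient(total_norm, max_norm=1.0)

# Every rank scales its own shard with the same global coefficient.
clipped = [[g * coef for g in grads] for grads in shard_grads]
```

If each rank clipped against only its local norm (3.0 and 4.0 here), the shards would be scaled by different factors and the result would no longer match clipping the full gradient.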
-
- 12 Nov, 2020 2 commits
-
-
Yuanyuan (Ana) Shen authored
* now works on a machine without CUDA, which makes debugging and quick tests easier
-
msbaines authored
-
- 11 Nov, 2020 2 commits
- 10 Nov, 2020 1 commit
-
-
Tom Birch authored
Adds support for:
* Reused layers (e.g. for weight sharing)
* Lazily-constructed layers
* Single-process control via PipeRPCWrapper
* PipelineStyle.AsyncSchedule, which lays the foundation for asynchronous pipeline work by introducing an event loop for each rank/worker to process either activations or gradients as they arrive

Also added examples for multi-process and PipeRPCWrapper
-
- 06 Nov, 2020 2 commits
-
-
Benjamin Lefaudeux authored
-
Benjamin Lefaudeux authored
* oss benchmark: add an --amp option * add a circleCI test
-
- 04 Nov, 2020 1 commit
-
-
msbaines authored
-
- 30 Oct, 2020 2 commits
- 29 Oct, 2020 1 commit
-
-
msbaines authored
-
- 28 Oct, 2020 2 commits
- 26 Oct, 2020 1 commit
-
-
Min Xu authored
-
- 23 Oct, 2020 3 commits
-
-
Benjamin Lefaudeux authored
* Make the benchmark tool easier to use, add a debug option
-
Benjamin Lefaudeux authored
* small refactor, getting rid of the while loop
-
msbaines authored
-
- 22 Oct, 2020 3 commits
-
-
Vittorio Caggiano authored
-
Vittorio Caggiano authored
fix broken link
-
Benjamin Lefaudeux authored
-
- 21 Oct, 2020 7 commits
-
-
Min Xu authored
- Aurick noticed this bug and I ran into it yesterday - after the fix, our CIFAR training now shows the same gain values from different replicas:
```
20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.3512124098087777
20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.3512124098087777
20-Oct-20 16:00:19 - DEBUG - rank1 - timing: data 0:00:00.000600 fwd 0:00:00.003678 loss 0:00:00.000086 bwd 0:00:00.314158 update 0:00:00.002132 rest 0:00:00.000399
20-Oct-20 16:00:19 - DEBUG - rank0 - timing: data 0:00:00.000643 fwd 0:00:00.003460 loss 0:00:00.000084 bwd 0:00:00.314678 update 0:00:00.002001 rest 0:00:00.000408
20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.3514997779980324
20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.3514997779980324
20-Oct-20 16:00:19 - DEBUG - rank1 - timing: data 0:00:00.000732 fwd 0:00:00.003689 loss 0:00:00.000086 bwd 0:00:00.314176 update 0:00:00.002146 rest 0:00:00.000397
20-Oct-20 16:00:19 - DEBUG - rank0 - timing: data 0:00:00.000646 fwd 0:00:00.003542 loss 0:00:00.000089 bwd 0:00:00.314549 update 0:00:00.001956 rest 0:00:00.000392
20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.352149646693932
20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.352149646693932
```
-
Benjamin Lefaudeux authored
* switching to MNIST
* updating the reference values, should be good to go
* download dataset once for all processes
-
Vittorio Caggiano authored
fix max depth
-
Vittorio Caggiano authored
fix maxdepth
-
Vittorio Caggiano authored
-
Vittorio Caggiano authored
* wip_example
* [wip]mnist_pipe_example
* [wip]mnist_pipe_example
* [wip]mnist_pipe_example
* [wip]mnist_pipe_example
* [wip]mnist_oss_example
* working prototype
* added tutorial script
* update tutorial
* Update mnist_test_oss.py
* Update mnist_test_oss.py
* Update mnist_test_oss.py
* Update mnist_test_pipe.py
* Update tutorial_oss.py
* Update tutorial_pipe.py
* Update tutorial_pipe.py
* Update mnist_test_oss.py
* Update tutorial_pipe.py
* Update mnist_test_pipe.py
* Update tutorial_pipe.py
* fix black
* fix flake8
* general fixes
* add example oss+pipe
* fix isort
* Update mnist_test_pipe.py
* fix black

Co-authored-by: Vittorio Caggiano <caggiano@devfair0253.h2.fair>
-
msbaines authored
-
- 20 Oct, 2020 1 commit
-
-
Benjamin Lefaudeux authored
* Minor quality-of-life change for debugging; makes it possible to test a host of models with the same code
-