1. 23 Oct, 2020 1 commit
  2. 21 Oct, 2020 1 commit
    • Min Xu's avatar
      [fix] fixing adascale all_reduce (#155) · 6802ad49
      Min Xu authored
      - Aurick noticed this bug and I ran into it yesterday
      - after the fix, our cifar training shows same gain values from
        different replics now:
      
      ```
      20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.3512124098087777
      20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.3512124098087777
      20-Oct-20 16:00:19 - DEBUG - rank1 - timing: data 0:00:00.000600 fwd 0:00:00.003678 loss 0:00:00.000086 bwd 0:00:00.314158 update 0:00:00.002132 rest 0:00:00.000399
      20-Oct-20 16:00:19 - DEBUG - rank0 - timing: data 0:00:00.000643 fwd 0:00:00.003460 loss 0:00:00.000084 bwd 0:00:00.314678 update 0:00:00.002001 rest 0:00:00.000408
      20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.3514997779980324
      20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.3514997779980324
      20-Oct-20 16:00:19 - DEBUG - rank1 - timing: data 0:00:00.000732 fwd 0:00:00.003689 loss 0:00:00.000086 bwd 0:00:00.314176 update 0:00:00.002146 rest 0:00:00.000397
      20-Oct-20 16:00:19 - DEBUG - rank0 - timing: data 0:00:00.000646 fwd 0:00:00.003542 loss 0:00:00.000089 bwd 0:00:00.314549 update 0:00:00.001956 rest 0:00:00.000392
      20-Oct-20 16:00:19 - DEBUG - rank1 - scale 2, gain ratio 1.352149646693932
      20-Oct-20 16:00:19 - DEBUG - rank0 - scale 2, gain ratio 1.352149646693932
      ```
      6802ad49
  3. 17 Sep, 2020 1 commit
    • Tom Birch's avatar
      Multi-process pipe (#90) · 63f7796a
      Tom Birch authored
      Adds support for distributing pipeline stages across multiple processes (and therefore multiple machines)
      * Adds a style argument to the Pipe constructor, defaulting to PipelineStyle.SingleProcess, but also supporting PipelineStyle.MultiProcess
      * Added support for lazy construction of modules (see lazy_construction for an example)
      * Added two implementations of inter-process communication: one based on rpc with globally visible queues, one based on send/recv
      * Copied all the relevant tests from tests/pipe to tests/pipe_process and modified them to exercise PipelineStyle.MultiProcess
      63f7796a
  4. 14 Aug, 2020 1 commit
  5. 31 Jul, 2020 2 commits
  6. 08 Jul, 2020 1 commit