OpenDAS / FastMoE · Commits
Commit 06e75b3a515cf3d69bb3d00bcebbb67b90af0449 (branch: fastmoe)
08 Feb, 2021 (11 commits)

06e75b3a · Seperate ddp test from test_numerical (Sengxian)
1a72a0cb · Merge pull request #3 from laekov/laekov/benchmarks (Rick Ho): separate benchmark with tests
101b847c · adapt benchmark to new moe module (Rick Ho)
8dac1a52 · merge new tests (Rick Ho)
34477955 · Add DataParallel test for FMoE (Sengxian)
40841453 · Fix top_k=3 testcases (Sengxian)
baf2b118 · Merge pull request #2 from test (Rick Ho): Add test for FMoE layer
184b0404 · Fix bug in DDP test (Sengxian)
5ead59db · Add test for arbitrary expert in FMoE (Sengxian)
d2678111 · a stronger benchmark (Rick Ho)
c5556037 · newer test with top-k bug fixed (Rick Ho)
07 Feb, 2021 (6 commits)

fc78d5c3 · Add test for FMoE (Sengxian)
103343ca · strong compatibility to different python and pytorch versions (Rick Ho)
d6169eb1 · fix typo (Rick Ho)
fb3e3c29 · update the mem-transformer example (Rick Ho)
b3380ec2 · support arbitrary module as expert (Rick Ho)
8328c794 · separate gates file (Rick Ho)
05 Feb, 2021 (8 commits)

9c92be55 · fit fmoe in transformer-xl (Rick Ho)
5e9bb2e9 · do not require comm in non-nccl environment (Rick Ho)
8f1f2ca5 · readme in transformer-xl example (Rick Ho)
59b27103 · update instructions for megatron (Rick Ho)
d6e7a429 · sync in the whole world instead of mp world in megatron (Rick Ho)
f2040d9f · pass pylint (Rick Ho)
bf2fd0c0 · support multiple pytorch versions prviate apis (Rick Ho)
481f5c4f · add functions to support checkpointing in megatron ddp (Rick Ho)
04 Feb, 2021 (6 commits)

79ccb7b6 · fix pytorch header compilation bug (Rick Ho)
15f98a10 · adapt with pytorch 1.8.0 (deprecated 1.6.0) (Rick Ho)
585604fe · setup pylint and write docs for functions (Rick Ho)
56c1bd63 · fix no grad after all-gather bug (Rick Ho)
d83234b0 · use parallel label in gate (Rick Ho)
67c667f2 · ddp module for sophiscated hybrid parallel (Rick Ho)
03 Feb, 2021 (3 commits)

ea66e5e5 · fix ensure device index bug (Rick Ho)
ae2c434e · fix pure data parallel (Rick Ho)
6b8d2f2e · fmoefy (Rick Ho)
02 Feb, 2021 (5 commits)

4b650671 · fix bmm out shape (Rick Ho)
dc3db673 · fix replica condition and minor optimizations (Rick Ho)
a8ecd3d7 · remove debug output and todo for replicated mp input (Rick Ho)
01ae2d72 · Optimize redundancy communication (Sengxian)
fdbac1df · Format using black and add model_parallel_rank (Sengxian)
01 Feb, 2021 (1 commit)

ae658b89 · add megatron example (Rick Ho)