Commits · aed2f75e209e525c842aec7c044af7acae2a4614 · OpenDAS / Megatron-LM

22 Jan, 2021 1 commit
- attention_mask_func cleanup · ebf8b89e
  Vijay Korthikanti authored Jan 22, 2021
  
  ebf8b89e
13 Jan, 2021 1 commit
- Adressing more review comments · 4ae54b55
  Vijay Korthikanti authored Jan 12, 2021
  
  4ae54b55
08 Jan, 2021 1 commit
- vision transformer model and vision classification task · 456f1728
  Vijay Korthikanti authored Jan 08, 2021
  
  456f1728
27 Dec, 2020 1 commit
- moved module to model and removed fp16 · b84d7a90
  mohammad authored Dec 26, 2020
  
  b84d7a90
12 Nov, 2020 5 commits
- Refactor code according to Jared's comments: move pipelining and... · 1979c242
  Deepak Narayanan authored Nov 12, 2020
```
Refactor code according to Jared's comments: move pipelining and non-pipelining training loops into separate methods

Also, use mpu.get_*_model_parallel_size() instead of args.*_model_parallel_size
```
  1979c242
- Clarifications in comments and minor refactoring to make main training loop more readable · 8fb2bc8c
  Deepak Narayanan authored Oct 28, 2020
  
  8fb2bc8c
- Removal of unneeded changes so that diff is smaller · 275d4e64
  Deepak Narayanan authored Oct 20, 2020
  
  275d4e64
- Intra-layer MP -> Tensor MP, Inter-layer MP -> Pipeline MP · 52a5f2f2
  Deepak Narayanan authored Oct 20, 2020
  
  52a5f2f2
- Pipeline parallelism implementation with periodic full-pipeline syncs · 7abd3e90
  Deepak Narayanan authored Aug 29, 2020
```
Also includes following changes for inter-layer model-parallel implementation:
- Refactoring of model implementations
- Training loop changes to support inter-layer communication using `ring_exchange`
- New groups for inter-layer communication
- Checkpoint changes
- Command line arguments
```
  7abd3e90
23 Jul, 2020 1 commit
- Address comments from last week · 11f76cd3
  Neel Kant authored Jul 22, 2020
  
  11f76cd3
07 Jul, 2020 1 commit
- Addressed Jared's comments · 8d7f508a
  Neel Kant authored Jul 06, 2020
  
  8d7f508a
29 Jun, 2020 1 commit
- Create tensors on cuda rather than copying · 76928caa
  Neel Kant authored Jun 29, 2020
  
  76928caa
24 Jun, 2020 1 commit
- More ict_merge changes and interactive testing · 3354081f
  Neel Kant authored Jun 23, 2020
  
  3354081f
22 Jun, 2020 1 commit
- Address most of comments from code review w/ Mohammad · 6495927e
  Neel Kant authored Jun 22, 2020
  
  6495927e
10 Jun, 2020 1 commit
- Ran and cleaned up · fcc500d6
  Neel Kant authored Jun 10, 2020
  
  fcc500d6
09 Jun, 2020 1 commit
- Correct retrieval utility and add salient span preprocessing · bf599e86
  Neel Kant authored Jun 08, 2020
  
  bf599e86
05 Jun, 2020 3 commits
- Some changes noticed late · c044f59a
  Neel Kant authored Jun 05, 2020
  
  c044f59a
- Prune changes to only be related to ICT · 32bb4edc
  Neel Kant authored Jun 05, 2020
  
  32bb4edc
- runs with new log loss but plateaus early · 20895f2c
  Neel Kant authored Jun 04, 2020
  
  20895f2c
31 May, 2020 1 commit
- Misc changes · 51204a4d
  Neel Kant authored May 30, 2020
  
  51204a4d
27 May, 2020 1 commit
- Correct CrossEntropyLoss · 0e8f4331
  Neel Kant authored May 27, 2020
  
  0e8f4331
26 May, 2020 1 commit
- Corrected realm example building, misc improvements for async concurrency · 2fd4ea6c
  Neel Kant authored May 25, 2020
  
  2fd4ea6c
24 May, 2020 1 commit
- Fix token alignment, add mpu checkpointing, misc training code · 8e22824e
  Neel Kant authored May 24, 2020
  
  8e22824e
20 May, 2020 1 commit
- Async works for total 8 GPU, indexer debug mode · a670b6c9
  Neel Kant authored May 19, 2020
  
  a670b6c9
19 May, 2020 1 commit
- Full cycle of communication complete. Also added BasicIndexBuilder · 5684f904
  Neel Kant authored May 19, 2020
  
  5684f904
14 May, 2020 2 commits
- Add retrieval utility and autoresume for indexer · 2f7d666c
  Neel Kant authored May 14, 2020
  
  2f7d666c
- Minor adjustments to fit QA codebase · 9b9b8e01
  Neel Kant authored May 14, 2020
  
  9b9b8e01
12 May, 2020 1 commit
- faiss use_gpu · 6e256445
  Neel Kant authored May 12, 2020
  
  6e256445
07 May, 2020 2 commits
- Add primitive filesystem-based IPC for indexer and trainer jobs · 15d0d55b
  Neel Kant authored May 07, 2020
  
  15d0d55b
- Add REALMAnswerSpanModel and MLM features · f2094783
  Neel Kant authored May 06, 2020
  
  f2094783
05 May, 2020 4 commits
- Debug null document and exclude trivial candidate · c17d880c
  Neel Kant authored May 05, 2020
  
  c17d880c
- Add null block and exclude trivial block · efcee158
  Neel Kant authored May 05, 2020
  
  efcee158
- Refactor and add more REALM arguments · 730266ca
  Neel Kant authored May 05, 2020
  
  730266ca
- Move REALM to use FAISS · a2e64ad5
  Neel Kant authored May 04, 2020
  
  a2e64ad5
03 May, 2020 1 commit
- more for pretrain_realm · 59031aa7
  Neel Kant authored May 03, 2020
  
  59031aa7
29 Apr, 2020 1 commit
- Update and test misc functionality · 6c0a5bd8
  Neel Kant authored Apr 28, 2020
  
  6c0a5bd8
24 Apr, 2020 3 commits
- Add test_retriever.sh · 1eccfc94
  Neel Kant authored Apr 24, 2020
  
  1eccfc94
- Mostly debugged realm-mlm · d7022c72
  Neel Kant authored Apr 24, 2020
  
  d7022c72
- Revise REALMBertModel and REALMRetriever · f42b4d24
  Neel Kant authored Apr 23, 2020
  
  f42b4d24
23 Apr, 2020 1 commit
- Debug test_retriever · ac79d374
  Neel Kant authored Apr 23, 2020
  
  ac79d374