- 19 Dec, 2020 17 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Deepak Narayanan authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
Rename --batch-size to --micro-batch-size and drop in-minibatch from --num-micro-batches-in-minibatch
-
Jared Casper authored
-
- 03 Dec, 2020 4 commits
-
-
Jared Casper authored
-
Jared Casper authored
Found a bug in consumed tokens initialization. See merge request ADLR/megatron-lm!182
-
mohammad authored
-
Jared Casper authored
-
- 02 Dec, 2020 8 commits
-
-
Jared Casper authored
Simplified sampler (will be needed later for batch size increase) and removed deprecated data stuff. See merge request ADLR/megatron-lm!177
-
Jared Casper authored
Blendable dataset. See merge request ADLR/megatron-lm!178
-
Jared Casper authored
Refactor learning rate so it is easier to make learning rate based on consumed samples. See merge request ADLR/megatron-lm!179
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
- 30 Nov, 2020 2 commits
- 29 Nov, 2020 3 commits
- 28 Nov, 2020 1 commit
-
-
mohammad authored
-
- 26 Nov, 2020 1 commit
-
-
mohammad authored
-
- 19 Nov, 2020 1 commit
-
-
Jared Casper authored
-
- 18 Nov, 2020 3 commits
-
-
Mohammad Shoeybi authored
Replace deprecated torch.norm with torch.linalg.norm. See merge request ADLR/megatron-lm!175
-
Mohammad Shoeybi authored
Community fixes. See merge request ADLR/megatron-lm!176
-
-