- 21 Dec, 2020 1 commit
-
-
Deepak Narayanan authored
Pipeline parallelism and inter-layer model parallelism implementation
See merge request ADLR/megatron-lm!159
-
- 19 Dec, 2020 28 commits
-
-
Jared Casper authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mshoeybi authored
-
mshoeybi authored
-
mshoeybi authored
-
mshoeybi authored
-
mshoeybi authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Deepak Narayanan authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
Rename --batch-size to --micro-batch-size and drop in-minibatch from --num-micro-batches-in-minibatch
-
Jared Casper authored
-
- 03 Dec, 2020 4 commits
-
-
Jared Casper authored
-
Jared Casper authored
found a bug in consumed tokens initialization
See merge request ADLR/megatron-lm!182
-
mohammad authored
-
Jared Casper authored
-
- 02 Dec, 2020 7 commits
-
-
Jared Casper authored
Simplified sampler (will be needed later for batch size increase) and removed deprecated data stuff
See merge request ADLR/megatron-lm!177
-
Jared Casper authored
Blendable dataset
See merge request ADLR/megatron-lm!178
-
Jared Casper authored
Refactor learning rate so it is easier to make learning rate based on consumed samples
See merge request ADLR/megatron-lm!179
-
mohammad authored
-
mohammad authored
-
mohammad authored
-
mohammad authored
-