- 21 Aug, 2021 4 commits
- 19 Aug, 2021 9 commits
-
-
Jared Casper authored
Checkpoint a set number of individual Transformer layers See merge request ADLR/megatron-lm!301
-
Jared Casper authored
Memory optimizations for contiguous buffers See merge request ADLR/megatron-lm!310
-
slym authored
-
slym authored
-
slym authored
consider the case of pipeline-model parallelism; clean up arguments; argument naming cleanup; update readme and examples
-
mshoeybi authored
-
mshoeybi authored
-
mshoeybi authored
-
mshoeybi authored
-
- 18 Aug, 2021 3 commits
-
-
mshoeybi authored
-
Jared Casper authored
bug fix for previous MR (Flag to call empty_cache() each iteration, to reduce fragmentation) See merge request ADLR/megatron-lm!309
-
Lawrence McAfee authored
-
- 17 Aug, 2021 14 commits
-
-
Jared Casper authored
minor changes from github issues See merge request ADLR/megatron-lm!308
-
mshoeybi authored
-
Jared Casper authored
Flag to call empty_cache() each iteration, to reduce fragmentation See merge request ADLR/megatron-lm!306
-
Jared Casper authored
simplified the iteration read check across ranks See merge request ADLR/megatron-lm!307
-
Lawrence McAfee authored
-
mshoeybi authored
-
Jared Casper authored
some small PRs from github See merge request ADLR/megatron-lm!305
-
Jared Casper authored
added across ranks sync for checkpoint iteration load and a couple of other fixes See merge request ADLR/megatron-lm!304
-
-
mshoeybi authored
Merge branch 'update-sample-bert-pretrain-args' of https://github.com/roclark/Megatron-LM into github_small_prs
-
-
-
-
mshoeybi authored
added across rank sync for checkpoint iteration load, fixed type for timing, and validation iterations
-
- 16 Aug, 2021 5 commits
-
-
Jared Casper authored
changed torch distributed init method from tcp to env See merge request ADLR/megatron-lm!297
-
Mohammad Shoeybi authored
-
eqy authored
Some tests expect a clean model parallel slate and complain if a previous test left something behind; this change clears more variables that the tests complain about.
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
- 13 Aug, 2021 1 commit
-
-
Jared Casper authored
removed saving of input/output tensors for forward-only passes of pipeline schedules See merge request ADLR/megatron-lm!300
-
- 12 Aug, 2021 4 commits
-
-
Jared Casper authored
Lmcafee/copygrad fix v2 See merge request ADLR/megatron-lm!299
-
Jared Casper authored
fused kernels simd fix for short sequences See merge request ADLR/megatron-lm!302
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-