- 24 Sep, 2022 2 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
- 23 Sep, 2022 5 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
Also merged in some changes from apex
-
Jared Casper authored
Clean up licensing. See merge request ADLR/megatron-lm!451
-
Jared Casper authored
The LICENSE file says everything is 3-clause BSD, which is what we want, but at some point the Apache license was added to the top of some files and that proliferated. This commit removes the Apache license from any files that we own the copyright to. Also updates the copyright year and removes the unnecessary coding=utf-8 line.
-
- 12 Sep, 2022 2 commits
-
-
Jared Casper authored
Update state_dict arguments for recent PyTorch versions. See merge request ADLR/megatron-lm!432
-
Jared Casper authored
Memory safety checks were incorrect for the tokens_to_generate=0 case. See merge request ADLR/megatron-lm!447
-
- 02 Sep, 2022 1 commit
-
-
rprenger authored
-
- 16 Aug, 2022 4 commits
-
-
Jared Casper authored
Fixed grad scaler warning for bf16. See merge request ADLR/megatron-lm!442
-
Mohammad Shoeybi authored
-
Jared Casper authored
Fixed grad scaler warning so it only prints for fp16. See merge request ADLR/megatron-lm!441
-
mshoeybi authored
-
- 10 Aug, 2022 2 commits
-
-
Jared Casper authored
Timing levels. See merge request ADLR/megatron-lm!436
-
Mohammad Shoeybi authored
-
- 06 Aug, 2022 2 commits
-
-
Jared Casper authored
Fix a size mismatch bug. See merge request ADLR/megatron-lm!438
-
Peng Xu authored
-
- 29 Jul, 2022 1 commit
-
-
Jared Casper authored
Support for all mask in fused kernel + avoiding in-place operation in bwd pass. See merge request ADLR/megatron-lm!435
-
- 28 Jul, 2022 1 commit
-
-
Vijay Korthikanti authored
-
- 27 Jul, 2022 1 commit
-
-
Jared Casper authored
Added a flag to switch between PyTorch and ring exchange p2p. See merge request ADLR/megatron-lm!434
-
- 26 Jul, 2022 3 commits
-
-
Mohammad Shoeybi authored
-
Jared Casper authored
Remove old merge tool. See merge request ADLR/megatron-lm!433
-
Jared Casper authored
-
- 21 Jul, 2022 1 commit
-
-
Jared Casper authored
Remove deprecated destination argument to state_dict functions and make all arguments keyword to avoid warnings.
-
- 20 Jul, 2022 2 commits
-
-
Jared Casper authored
Adding checks for total number of tokens to keep the server from crashing. See merge request ADLR/megatron-lm!428
-
Jared Casper authored
Implements the top_p decay and top_p bound parameters from the Factual Sampling work. See merge request ADLR/megatron-lm!423
-
- 19 Jul, 2022 5 commits
-
-
Jared Casper authored
Distributed optimizer readme section. See merge request ADLR/megatron-lm!429
-
Jared Casper authored
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
- 18 Jul, 2022 4 commits
-
-
rprenger authored
-
rprenger authored
-
Jared Casper authored
Checkpoint conversion utility. See merge request ADLR/megatron-lm!289
-
Jared Casper authored
Fix bugs in beam search when using pipeline parallelism. See merge request ADLR/megatron-lm!426
-
- 15 Jul, 2022 3 commits
- 12 Jul, 2022 1 commit
-
-
Jared Casper authored
-