- 30 Sep, 2021 1 commit
-
-
mshoeybi authored
-
- 29 Sep, 2021 3 commits
- 27 Sep, 2021 1 commit
-
-
mshoeybi authored
-
- 24 Sep, 2021 1 commit
-
-
mshoeybi authored
-
- 23 Sep, 2021 2 commits
- 22 Sep, 2021 3 commits
- 21 Sep, 2021 1 commit
-
-
mshoeybi authored
-
- 20 Sep, 2021 2 commits
-
-
Jared Casper authored
Inference context optimization See merge request ADLR/megatron-lm!321
-
Mohammad Shoeybi authored
-
- 17 Sep, 2021 3 commits
-
-
Jared Casper authored
Add temperature to the server API See merge request ADLR/megatron-lm!325
-
Robert Clark authored
A temperature value between 0.0 and 100.0 can now be specified via the API while running the text generation server. The value passed to --temperature while running the text generation server is kept as the default value for all API calls that don't include temperature, even if different values were manually specified previously. Signed-Off-By: Robert Clark <roclark@nvidia.com>
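The fallback behaviour described in this commit message could be sketched as follows. This is only an illustration, not the server's actual code. The function name `resolve_temperature`, the dict-shaped request payload, and the choice to accept both endpoints of the range are assumptions made for the example.

```python
from typing import Any, Mapping

# Range taken from the commit message; accepting both endpoints is an assumption.
MIN_TEMPERATURE = 0.0
MAX_TEMPERATURE = 100.0


def resolve_temperature(payload: Mapping[str, Any], server_default: float) -> float:
    """Hypothetical helper: pick the temperature for a single API call.

    Uses the request's "temperature" when present, otherwise the value the
    server was started with. No state is kept between calls, so an earlier
    request's temperature never replaces the server default.
    """
    if "temperature" not in payload:
        return server_default

    value = payload["temperature"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("temperature must be a number")
    if not (MIN_TEMPERATURE <= value <= MAX_TEMPERATURE):
        raise ValueError("temperature must be between 0.0 and 100.0")
    return float(value)
```

Because the helper is stateless, a call that sets `temperature` followed by a call that omits it gets the server default again on the second call, which matches the behaviour the message describes.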
-
Jared Casper authored
Fixes a bug in broadcasting that was causing hanging See merge request ADLR/megatron-lm!327
-
- 14 Sep, 2021 5 commits
-
-
Ryan Prenger authored
Changing API to fix LM Harness Evaluation and make API less confusing/terrible See merge request ADLR/megatron-lm!326
-
rprenger authored
-
rprenger authored
-
rprenger authored
-
rprenger authored
-
- 13 Sep, 2021 1 commit
-
-
rprenger authored
Changing the interface to the lm eval harness and fixing bugs caused by misunderstanding out_seq_length
-
- 10 Sep, 2021 1 commit
-
-
Jared Casper authored
Reducing memory usage during inference See merge request ADLR/megatron-lm!320
-
- 08 Sep, 2021 2 commits
-
-
Jared Casper authored
This file doesn't work with the new inference code, so it should be removed See merge request ADLR/megatron-lm!322
-
rprenger authored
-
- 07 Sep, 2021 2 commits
-
-
Jared Casper authored
use low-priority stream for nccl overlapping all-reduce and gemm See merge request ADLR/megatron-lm!319
-
Sangkug Lym authored
-
- 03 Sep, 2021 3 commits
-
-
rprenger authored
-
Jared Casper authored
allreduce overlap with wgrad gemm See merge request ADLR/megatron-lm!316
-
slym authored
-
- 02 Sep, 2021 4 commits
-
-
slym authored
-
slym authored
-
Sangkug Lym authored
allreduce overlap with wgrad gemm: change custom delay to dummy add
-
Jared Casper authored
Letting server return the log-probabilities of the context and generated text See merge request ADLR/megatron-lm!317
-
- 01 Sep, 2021 2 commits
- 31 Aug, 2021 2 commits
- 27 Aug, 2021 1 commit
-
-
Jared Casper authored
Use nvfuser at pytorch >= 1.10 See merge request ADLR/megatron-lm!314
-