- 31 Oct, 2021 (1 commit)
  - Satpal Singh Rathore authored
- 07 Oct, 2021 (1 commit)
  - Jared Casper authored: Fix inference after T5 pipeline merge. See merge request ADLR/megatron-lm!332
- 06 Oct, 2021 (3 commits)
  - Jared Casper authored: jit function argument type specification. See merge request ADLR/megatron-lm!334
  - Jared Casper authored: THC/THCDeviceUtils.cuh -> ATen/cuda/DeviceUtils.cuh in fused layer norm. See merge request ADLR/megatron-lm!333
  - Sangkug Lym authored
- 05 Oct, 2021 (1 commit)
  - Masaki Kozuki authored
- 01 Oct, 2021 (2 commits)
  - Ryan Prenger authored: Fixing a bug where temperature was never actually broadcast. See merge request ADLR/megatron-lm!330
  - Jared Casper authored: Adds some backward-compatibility code so old inference code still works.
- 30 Sep, 2021 (2 commits)
  - Jared Casper authored: Pipeline parallelism for T5 model. See merge request ADLR/megatron-lm!288
  - Mohammad Shoeybi authored: Remove outdated packaging files. See merge request ADLR/megatron-lm!331
- 29 Sep, 2021 (2 commits)
  - Jared Casper authored
  - Jared Casper authored
- 23 Sep, 2021 (4 commits)
  - rprenger authored
  - rprenger authored
  - Mohammad Shoeybi authored: Add a Beginning of Sentence token option, and add a semaphore during multi-threading to prevent crashes and hangs caused by connection keep-alives. See merge request ADLR/megatron-lm!328
  - rprenger authored
- 21 Sep, 2021 (3 commits)
  - Jared Casper authored: Fixing a memory bug caused by DDP during tasks. See merge request ADLR/megatron-lm!329
  - rprenger authored
  - rprenger authored
- 20 Sep, 2021 (5 commits)
  - rprenger authored
  - Jared Casper authored: Inference context optimization. See merge request ADLR/megatron-lm!321
  - Mohammad Shoeybi authored
  - rprenger authored
  - rprenger authored
- 17 Sep, 2021 (3 commits)
  - Jared Casper authored: Add temperature to the server API. See merge request ADLR/megatron-lm!325
  - Robert Clark authored: A temperature value between 0.0 and 100.0 can now be specified via the API while running the text generation server. The value passed to --temperature at server startup remains the default for all API calls that do not include a temperature, even if different values were specified in earlier calls. Signed-off-by: Robert Clark <roclark@nvidia.com>
  - Jared Casper authored: Fixes a bug in broadcasting that was causing hanging. See merge request ADLR/megatron-lm!327
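The temperature commit above describes a fallback rule: the server-side --temperature value is used whenever a request omits the field, and a per-request value overrides it for that call only. A minimal sketch of a client-side request builder that follows this rule is below; the field names (`prompts`, `tokens_to_generate`, `temperature`) and the fallback semantics as coded here are assumptions for illustration, not confirmed from this log.

```python
import json


def build_request(prompt, tokens_to_generate=64, temperature=None):
    """Build a JSON body for a text generation server request.

    If temperature is None, the field is omitted, so the server would fall
    back to its startup default (--temperature), per the commit message.
    Field names are hypothetical, chosen for illustration.
    """
    body = {"prompts": [prompt], "tokens_to_generate": tokens_to_generate}
    if temperature is not None:
        # The commit message states the accepted range is 0.0 to 100.0.
        if not 0.0 <= temperature <= 100.0:
            raise ValueError("temperature must be between 0.0 and 100.0")
        body["temperature"] = temperature
    return json.dumps(body)


# Explicit temperature overrides the server default for this call only.
print(build_request("Hello, world", temperature=0.7))
# Omitting temperature defers to the server's --temperature default.
print(build_request("Hello, world"))
```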
- 14 Sep, 2021 (5 commits)
  - Ryan Prenger authored: Changing the API to fix LM Harness evaluation and make the API less confusing/terrible. See merge request ADLR/megatron-lm!326
  - rprenger authored
  - rprenger authored
  - rprenger authored
  - rprenger authored
- 13 Sep, 2021 (1 commit)
  - rprenger authored: Changing the interface to the LM eval harness and fixing bugs caused by a misunderstanding of out_seq_length.
- 10 Sep, 2021 (1 commit)
  - Jared Casper authored: Reducing memory usage during inference. See merge request ADLR/megatron-lm!320
- 08 Sep, 2021 (2 commits)
  - Jared Casper authored: This file doesn't work with the new inference code, so it should be removed. See merge request ADLR/megatron-lm!322
  - rprenger authored
- 07 Sep, 2021 (2 commits)
  - Jared Casper authored: Use a low-priority stream for NCCL, overlapping all-reduce and GEMM. See merge request ADLR/megatron-lm!319
  - Sangkug Lym authored
- 03 Sep, 2021 (2 commits)
  - rprenger authored
  - Jared Casper authored: All-reduce overlap with wgrad GEMM. See merge request ADLR/megatron-lm!316