Commit 2a34e0ec authored by Jared Casper

Merge branch 'patch-1' of https://github.com/vycezhong/Megatron-LM into github-pr

parents 34f55429 30abf2c5
@@ -240,7 +240,7 @@ class ColumnParallelLinear(torch.nn.Module):
 input_size: first dimension of matrix A.
 output_size: second dimension of matrix A.
 bias: If true, add bias
-gather_output: If true, call all-gether on output and make Y avaiable
+gather_output: If true, call all-gather on output and make Y avaiable
 to all GPUs, otherwise, every GPU will have its output
 which is Y_i = XA_i
 init_method: method to initialize weights. Note that bias is always set
...
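
For readers unfamiliar with the `gather_output` option described in the docstring above, the following is a minimal, hedged sketch of the idea in plain Python. It is not Megatron-LM code: the helpers `matmul`, `split_columns`, and `column_parallel_forward` are hypothetical names introduced for illustration, and the GPUs are simulated with an ordinary list rather than a real process group. It assumes A is split column-wise as A = [A_1, ..., A_p], so each rank computes Y_i = XA_i, and the all-gather is modelled as concatenation along the last dimension.

```python
# Illustrative sketch only: a plain-Python stand-in for column parallelism.
# All helper names are hypothetical; "ranks" are simulated with a list.

def matmul(x, a):
    """Multiply x (n x k) by a (k x m); both are lists of rows."""
    return [[sum(xi * aj for xi, aj in zip(row, col)) for col in zip(*a)]
            for row in x]

def split_columns(a, world_size):
    """Split A into equal column blocks A_1..A_p, one per simulated rank."""
    cols = len(a[0])
    assert cols % world_size == 0, "output_size must divide evenly"
    step = cols // world_size
    return [[row[r * step:(r + 1) * step] for row in a]
            for r in range(world_size)]

def column_parallel_forward(x, a, world_size, gather_output):
    """Return a list with one output per simulated rank."""
    shards = split_columns(a, world_size)
    partial = [matmul(x, a_i) for a_i in shards]  # Y_i = X A_i on rank i
    if not gather_output:
        # Each rank keeps only its own slice Y_i.
        return partial
    # Simulated all-gather: concatenate the slices along the last dimension
    # so that every rank ends up holding the full Y.
    full = [sum((p[i] for p in partial), []) for i in range(len(x))]
    return [full for _ in range(world_size)]

x = [[1, 2]]
a = [[1, 0, 2, 1],
     [0, 1, 1, 3]]
per_rank = column_parallel_forward(x, a, world_size=2, gather_output=False)
gathered = column_parallel_forward(x, a, world_size=2, gather_output=True)
```

With `gather_output=False` each simulated rank holds a different slice of the output; with `gather_output=True` every rank holds the same full result, at the cost of the extra communication step.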