Unverified commit 68fc78dd authored by Kirthi Shankar Sivamani, committed by GitHub

Remove userbuf docs (#164)


Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
parent 1a08ba19
@@ -924,12 +924,6 @@ class TransformerLayer(torch.nn.Module):
     `set_tensor_parallel_group(tp_group)` method on the initialized module before the
     forward pass to supply the tensor parallel group needed for tensor and sequence
     parallel collectives.
-    ub_bulk_wgrad: bool, default = False
-                  Bulk overlap UserBuffer ReduceScatter | WGRAD GEMM
-    ub_bulk_dgrad: bool, default = False
-                  Bulk overlap UserBuffer AllGather | DGRAD GEMM
-    ub_split_ag: bool, default = False
-                 Split pipelined overlap UserBuffer AllGather -> GEMM
     Optimization parameters
     -----------------------
...
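For reference, the three removed docstring entries describe boolean flags that all default to False. The sketch below is a hypothetical stand-in that mirrors those names and defaults as a plain config object; it is not the real transformer_engine API surface, only an illustration of the parameters this commit stops documenting.

```python
from dataclasses import dataclass


# Hypothetical config mirroring the UserBuffer overlap flags whose
# documentation this commit removes; names and defaults are taken from
# the deleted docstring lines, not from the live TransformerLayer API.
@dataclass
class UserBufferOverlapConfig:
    # bulk overlap of UserBuffer ReduceScatter with the WGRAD GEMM
    ub_bulk_wgrad: bool = False
    # bulk overlap of UserBuffer AllGather with the DGRAD GEMM
    ub_bulk_dgrad: bool = False
    # split pipelined overlap: UserBuffer AllGather -> GEMM
    ub_split_ag: bool = False


cfg = UserBufferOverlapConfig()
print(cfg.ub_bulk_wgrad, cfg.ub_bulk_dgrad, cfg.ub_split_ag)
```

All three overlap optimizations are opt-in, which matches the `default = False` in the removed documentation.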