Unverified Commit fe8a2c54 authored by Harry Mellor, committed by GitHub

[Docs] Improve docstring formatting for `FusedMoEParallelConfig.make` (#21117)


Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
parent 4ef00b5c
@@ -192,40 +192,43 @@ class FusedMoEParallelConfig:
def make(tp_size_: int, dp_size_: int, | def make(tp_size_: int, dp_size_: int,
vllm_parallel_config: ParallelConfig) -> "FusedMoEParallelConfig": | vllm_parallel_config: ParallelConfig) -> "FusedMoEParallelConfig":
""" | """
Determine MoE parallel configuration. Based on the input tp_size_, | Determine MoE parallel configuration. Based on the input `tp_size_`,
dp_size_, ep_size_ and vllm's parallel config, determine what | `dp_size_` and vllm's parallel config, determine what
level's of parallelism to use in the fused moe layer. | level's of parallelism to use in the fused moe layer.
Args: | Args:
tp_size_ (int): tp_size passed into the FusedMoE constructor. | tp_size_ (int): `tp_size` passed into the FusedMoE constructor.
dp_size_ (int): dp_size passed into the FusedMoE constructor. | dp_size_ (int): `dp_size` passed into the FusedMoE constructor.
ep_size_ (int): ep_size passed into the FusedMoE constructor. | vllm_parallel_config (ParallelConfig): vLLM's parallel config
vllm_parallel_config (ParallelConfig): vllm's parallel config | object which contains the `enable_expert_parallel` flag.
object. |
Examples: | Examples:
When there is no parallelism requested, i.e. tp_size_ = dp_size_ = 1, | When there is no parallelism requested,
we simply return the sizes unaltered and the ranks set to 0. | i.e. `tp_size_` = `dp_size_` = 1, we simply return the sizes
| unaltered and the ranks set to 0.
Expert Parallelism is considered only when either dp_size_ or tp_size_ | Expert Parallelism is considered only when either `dp_size_` or
is non trivial. | `tp_size_` is non trivial.
When TP = 2, DP = 1 and EP = False, the configuration on different | When TP = 2, DP = 1 and EP = False, the configuration on different
devices, | devices:
- device 0 : TP = {2, 0} DP = {1, 0} EP = {1, 0} // | - device 0 : TP = {2, 0} DP = {1, 0} EP = {1, 0} //
legend : {size, rank} | legend : {size, rank}
- device 1 : TP = {2, 1} DP = {1, 0} EP = {1, 0} | - device 1 : TP = {2, 1} DP = {1, 0} EP = {1, 0}
- Comment : Tensors are sharded across 2 devices. | - Comment : Tensors are sharded across 2 devices.
When TP = 1, DP = 2 and EP = False, the configuration on different | When TP = 1, DP = 2 and EP = False, the configuration on different
devices, | devices:
- device 0 : TP = {2, 0} DP = {2, 0} EP = {1, 0} | - device 0 : TP = {2, 0} DP = {2, 0} EP = {1, 0}
- device 1 : TP = {2, 1} DP = {2, 1} EP = {1, 0} | - device 1 : TP = {2, 1} DP = {2, 1} EP = {1, 0}
- Comment: There are 2 engine instances and the tensors are sharded | - Comment: There are 2 engine instances and the tensors are sharded
across 2 decvices. | across 2 decvices.
When TP = 2, DP = 2 and EP = False, the configuration on different | When TP = 2, DP = 2 and EP = False, the configuration on different
devices, | devices:
- device 0: TP = {4, 0} DP = {2, 0} EP = {1, 0} | - device 0: TP = {4, 0} DP = {2, 0} EP = {1, 0}
- device 1: TP = {4, 1} DP = {2, 0} EP = {1, 0} | - device 1: TP = {4, 1} DP = {2, 0} EP = {1, 0}
- device 2: TP = {4, 2} DP = {2, 1} EP = {1, 0} | - device 2: TP = {4, 2} DP = {2, 1} EP = {1, 0}
@@ -234,20 +237,23 @@ class FusedMoEParallelConfig:
across 4 devices. | across 4 devices.
When, TP = 2, DP = 1 and EP = True, the configuration on different | When, TP = 2, DP = 1 and EP = True, the configuration on different
devices, | devices:
- device 0: TP = {1, 0} DP = {1, 0} EP = {2, 0} | - device 0: TP = {1, 0} DP = {1, 0} EP = {2, 0}
- device 1: TP = {1, 0} DP = {1, 0} EP = {2, 1} | - device 1: TP = {1, 0} DP = {1, 0} EP = {2, 1}
- Comment: The experts are split between the 2 devices. | - Comment: The experts are split between the 2 devices.
When, TP = 1, DP = 2 and EP = True, the configuration on different | When, TP = 1, DP = 2 and EP = True, the configuration on different
devices, | devices:
- device 0: TP = {1, 0} DP = {2, 0} EP = {2, 0} | - device 0: TP = {1, 0} DP = {2, 0} EP = {2, 0}
- device 1: TP = {1, 0} DP = {2, 1} EP = {2, 1} | - device 1: TP = {1, 0} DP = {2, 1} EP = {2, 1}
- Comment: There are 2 engine instances and the experts are split | - Comment: There are 2 engine instances and the experts are split
between the 2 devices. | between the 2 devices.
When TP = 2, DP = 2 and EP = True, the configuration on different | When TP = 2, DP = 2 and EP = True, the configuration on different
devices, | devices:
- device 0: TP = {1, 0} DP = {2, 0} EP = {4, 0} | - device 0: TP = {1, 0} DP = {2, 0} EP = {4, 0}
- device 1: TP = {1, 0} DP = {2, 0} EP = {4, 1} | - device 1: TP = {1, 0} DP = {2, 0} EP = {4, 1}
- device 2: TP = {1, 0} DP = {2, 1} EP = {4, 2} | - device 2: TP = {1, 0} DP = {2, 1} EP = {4, 2}
......
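
The `{size, rank}` tables in the docstring can be reproduced with a small standalone sketch. This is not the vLLM implementation: `MoEParallelSketch`, `make_sketch` and `configs_for_all_devices` are hypothetical names, ranks are passed in explicitly instead of being read from distributed process groups, and the rank-flattening rule `dp_rank * tp_size_ + tp_rank` is an assumption inferred from the examples shown in the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEParallelSketch:
    """Hypothetical stand-in for FusedMoEParallelConfig ({size, rank} pairs)."""
    tp_size: int
    tp_rank: int
    dp_size: int
    dp_rank: int
    ep_size: int
    ep_rank: int


def make_sketch(tp_size_: int, dp_size_: int, enable_expert_parallel: bool,
                tp_rank: int, dp_rank: int) -> MoEParallelSketch:
    # Expert parallelism is considered only when tp_size_ or dp_size_ is
    # non-trivial, as the docstring states.
    use_ep = enable_expert_parallel and tp_size_ * dp_size_ > 1

    # Assumption inferred from the docstring tables: all tp_size_ * dp_size_
    # devices are treated as one flat group inside the MoE layer.
    flat_size = tp_size_ * dp_size_
    flat_rank = dp_rank * tp_size_ + tp_rank

    if use_ep:
        # Each device owns whole experts, so TP collapses to {1, 0}.
        return MoEParallelSketch(1, 0, dp_size_, dp_rank, flat_size, flat_rank)
    # Without EP, tensors are sharded across the flat group.
    return MoEParallelSketch(flat_size, flat_rank, dp_size_, dp_rank, 1, 0)


def configs_for_all_devices(tp_size_: int, dp_size_: int,
                            enable_expert_parallel: bool):
    """Enumerate one config per device, ordered by (dp_rank, tp_rank)."""
    return [
        make_sketch(tp_size_, dp_size_, enable_expert_parallel, t, d)
        for d in range(dp_size_)
        for t in range(tp_size_)
    ]


if __name__ == "__main__":
    for device, cfg in enumerate(configs_for_all_devices(2, 2, True)):
        print(f"device {device}: TP = {{{cfg.tp_size}, {cfg.tp_rank}}} "
              f"DP = {{{cfg.dp_size}, {cfg.dp_rank}}} "
              f"EP = {{{cfg.ep_size}, {cfg.ep_rank}}}")
```

Calling `configs_for_all_devices(2, 2, False)` instead gives the TP-sharded layout from the first hunk, where TP grows to size 4 and EP stays at `{1, 0}`.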