"...git@developer.sourcefind.cn:modelzoo/timegpt-pytorch.git" did not exist on "f42429f6109f260f28fc815fe515f0de1e3b8b91"
auto_parallel.rst 2.3 KB
Newer Older
yuguo's avatar
yuguo committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
Auto Parallelism
====================================================

As deep-learning models keep growing in scale, distributed training, or parallelism,
becomes necessary. Data parallelism and model parallelism have been designed
to speed up training and to mitigate memory issues.

In OneFlow, the SBP signature lets users configure a parallelism policy easily.
However, users still need to specify the SBP property for each operator, or at least for most of them.
They might spend days digging into the details of parallelism, only to end up with
low throughput because of a slight mistake in the SBP signature configuration.
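
For comparison, a manual configuration typically looks like the minimal sketch
below; the placement, the split/broadcast choices, and the tensor shapes are
illustrative assumptions, not a recommended setup::

    import oneflow as flow

    # Hypothetical two-GPU placement.
    placement = flow.placement("cuda", ranks=[0, 1])

    # Every global tensor needs an explicit SBP choice.
    x = flow.randn(16, 8).to_global(
        placement=placement, sbp=flow.sbp.split(0)  # split along the batch dimension
    )
    w = flow.randn(8, 4).to_global(
        placement=placement, sbp=flow.sbp.broadcast  # replicate the weight
    )
    y = flow.matmul(x, w)  # each SBP choice above affects throughput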

.. note::

   Auto parallelism only works in :doc:`graph` mode.


Our strength
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To free users from all those SBP signature configurations, we developed auto parallelism.
Placement configuration is still necessary, since auto placement is not supported yet.
If you are reading this before diving into any SBP details, congratulations:
you do not need to learn about SBP at all. You can write your code just as you would in CPU mode.
Auto parallelism will generate a fast strategy customized for your specific model,
its parameter sizes, and the number of available GPUs.


How to use auto parallelism?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You simply need to enable the corresponding configuration in your :doc:`graph` model.

Example::

    import oneflow as flow
    class SubclassGraph(flow.nn.Graph):
        def __init__(self):
            super().__init__() # MUST be called
            # auto parallelism configuration
            self.config.enable_auto_parallel(True)
            # other configurations about auto parallelism
            # ......

        def build(self):
            pass

.. warning::

   If you enable auto parallelism, OneFlow will take care of the SBP configuration
   of every operator, except for explicit ``to_global`` calls.
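
To illustrate the exception, the sketch below enables auto parallelism but still
moves the module and the input to a user-chosen placement with ``to_global``;
the module, placement, and shapes are illustrative assumptions, and it presumes
a recent OneFlow in which both tensors and modules support ``to_global``. The
SBP of every other operator is then decided by the auto-parallel planner::

    import oneflow as flow

    placement = flow.placement("cuda", ranks=[0, 1])  # hypothetical 2-GPU placement

    class LinearGraph(flow.nn.Graph):
        def __init__(self, model):
            super().__init__()
            self.model = model
            self.config.enable_auto_parallel(True)  # let OneFlow choose SBPs

        def build(self, x):
            return self.model(x)

    # Explicit choices made via to_global are kept as-is.
    model = flow.nn.Linear(8, 4).to_global(placement=placement, sbp=flow.sbp.broadcast)
    x = flow.randn(16, 8).to_global(placement=placement, sbp=flow.sbp.split(0))

    graph = LinearGraph(model)
    y = graph(x)

Running such a script assumes it is launched once per rank (for example with
OneFlow's distributed launcher) so that both devices in the placement are available.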


Configuration API for auto parallelism
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. currentmodule:: oneflow.nn.graph.graph_config.GraphConfig

.. autosummary::
    :toctree: generated
    :nosignatures:

    enable_auto_parallel
    enable_auto_parallel_ignore_user_sbp_config
    set_auto_parallel_computation_cost_ratio
    set_auto_parallel_wait_time
    enable_auto_parallel_trunk_algo
    enable_auto_parallel_sbp_collector
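
The sketch below shows how some of these knobs might be combined in a graph's
``__init__``; the argument values are illustrative, and the boolean/float
argument types as well as the comments on each knob are assumptions inferred
from the method names rather than documented behavior::

    import oneflow as flow

    class TunedGraph(flow.nn.Graph):
        def __init__(self):
            super().__init__()
            self.config.enable_auto_parallel(True)
            # Assumed: let the planner override any SBP hints set by the user.
            self.config.enable_auto_parallel_ignore_user_sbp_config(True)
            # Assumed: weight of computation cost relative to transfer cost.
            self.config.set_auto_parallel_computation_cost_ratio(0.05)
            # Assumed: extra latency added to each transfer cost estimate.
            self.config.set_auto_parallel_wait_time(1.65e4)

        def build(self):
            pass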