"git@developer.sourcefind.cn:guobj/qwen_lmdeploy.git" did not exist on "c16b857bfbf881923b94f11e02a0d14ad5d9d887"
Unverified commit 79712e7e authored by Stas Bekman, committed by GitHub

[deepspeed] docs (#11940)

* deepspeed docs

* cleanup

* cleanup
@@ -1627,6 +1627,34 @@ Here is the `documentation
<https://www.deepspeed.ai/docs/config-json/#automatic-mixed-precision-amp-training-options>`__.

Batch Size
=======================================================================================================================

To configure batch size, use:

.. code-block:: json

    {
       "train_batch_size": "auto",
       "train_micro_batch_size_per_gpu": "auto"
    }

and the :class:`~transformers.Trainer` will automatically set ``train_micro_batch_size_per_gpu`` to the value of
``args.per_device_train_batch_size`` and ``train_batch_size`` to ``args.world_size * args.per_device_train_batch_size *
args.gradient_accumulation_steps``.
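
For instance (hypothetical numbers, only to illustrate the arithmetic): launching on 2 GPUs with
``--per_device_train_batch_size 8 --gradient_accumulation_steps 2`` would make the ``auto`` values above resolve to:

.. code-block:: json

    {
       "train_batch_size": 32,
       "train_micro_batch_size_per_gpu": 8
    }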

You can also set the values explicitly:

.. code-block:: json

    {
       "train_batch_size": 12,
       "train_micro_batch_size_per_gpu": 4
    }

But then you're on your own synchronizing the :class:`~transformers.Trainer` command line arguments and the DeepSpeed
configuration.
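
As a minimal sketch of what that synchronization means (assuming a single-node, single-GPU run), one option is to also
spell out ``gradient_accumulation_steps`` so the config is self-consistent:

.. code-block:: json

    {
       "train_batch_size": 12,
       "train_micro_batch_size_per_gpu": 4,
       "gradient_accumulation_steps": 3
    }

The matching :class:`~transformers.Trainer` arguments would then be ``--per_device_train_batch_size 4
--gradient_accumulation_steps 3``, since ``1 * 4 * 3 == 12``; any other combination leaves DeepSpeed and the Trainer
disagreeing about the effective batch size.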

Gradient Accumulation
=======================================================================================================================