Commit cad9f5c6 authored by Dean Wyatte, committed by GitHub

Update docs around mixing hf scheduler with deepspeed optimizer (#28223)

update docs around mixing hf scheduler with deepspeed optimizer
parent 3cefac1d
@@ -1221,12 +1221,7 @@ Therefore you have two ways to take advantage of this very beneficial feature:
 ### Optimizer and Scheduler
 As long as you don't enable `offload_optimizer` you can mix and match DeepSpeed and HuggingFace schedulers and
-optimizers, with the exception of using the combination of HuggingFace scheduler and DeepSpeed optimizer:
-| Combos | HF Scheduler | DS Scheduler |
-|:-------------|:-------------|:-------------|
-| HF Optimizer | Yes | Yes |
-| DS Optimizer | No | Yes |
+optimizers.
 It is possible to use a non-DeepSpeed optimizer when `offload_optimizer` is enabled, as long as it has both CPU and
 GPU implementation (except LAMB).
......
@@ -275,14 +275,7 @@ def deepspeed_optim_sched(trainer, hf_deepspeed_config, args, num_training_steps
 config = hf_deepspeed_config.config
-# Optimizer + Scheduler
-# Currently supported combos:
-# 1. DS scheduler + DS optimizer: Yes
-# 2. HF scheduler + HF optimizer: Yes
-# 3. DS scheduler + HF optimizer: Yes
-# 4. HF scheduler + DS optimizer: No
-#
-# Unless Offload is enabled in which case it's:
+# Mixing and matching DS schedulers and optimizers is supported unless Offload is enabled in which case it's:
 # 1. DS scheduler + DS optimizer: Yes
 # 2. HF scheduler + HF optimizer: Mostly*
 # 3. DS scheduler + HF optimizer: Mostly*
......
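For illustration only, the combination this commit stops documenting as unsupported (DS optimizer with HF scheduler, no optimizer offload) could look like the sketch below. It assumes that the integration picks the DeepSpeed side whenever the config contains an `optimizer` or `scheduler` section and falls back to the HF side otherwise, and that offload is indicated by `zero_optimization.offload_optimizer.device`. `describe_combo` is a hypothetical helper written for this note; it is not part of transformers or DeepSpeed.

```python
# Hypothetical helper for illustration; not part of transformers or deepspeed.
def describe_combo(ds_config: dict) -> dict:
    """Report which library would supply the optimizer and the scheduler.

    Assumption: a section present in the DeepSpeed config means the DS
    implementation is used, and a missing section means the HF one is used.
    """
    zero = ds_config.get("zero_optimization", {})
    offload_device = zero.get("offload_optimizer", {}).get("device", "none")
    return {
        "optimizer": "DS" if "optimizer" in ds_config else "HF",
        "scheduler": "DS" if "scheduler" in ds_config else "HF",
        "offload_optimizer": offload_device not in (None, "none"),
    }


# DeepSpeed optimizer configured, no "scheduler" section, no optimizer offload.
# Under the assumption above, the scheduler would come from the HF side.
ds_config = {
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 3e-5, "betas": [0.9, 0.999], "eps": 1e-8, "weight_decay": 0.01},
    },
    "zero_optimization": {"stage": 2},
}

combo = describe_combo(ds_config)
print(combo)
```

Adding a `scheduler` section to the same config would switch the reported scheduler to `DS`, and setting `zero_optimization.offload_optimizer.device` to `"cpu"` would flag the offload case that the updated code comment treats separately.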