Unverified Commit e75cb0cb authored by Shauray Singh, committed by GitHub

fix type annotations for arguments in training_args (#24550)

* testing

* example script

* fix typehinting

* some tests

* make test

* optional update

* Union of arguments

* does this fix the issue

* remove reports

* set default to False

* documentation change

* None support

* does not need None

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)

Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"

This reverts commit c5e29d43.

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* merge

* hacky fix

* fixup

---------
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
parent 0c41765d
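For context, the old annotations declared these fields as plain strings, so static type checkers rejected list and dict values that the Trainer already accepted at runtime. A simplified sketch of the mismatch and the fix, using stand-in dataclasses rather than the actual transformers code (`List[str]` stands in for `List[FSDPOption]`):

from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

@dataclass
class Before:
    # The old, too-narrow annotations: list/dict values fail type checking.
    fsdp: str = field(default="")
    deepspeed: Optional[str] = None

@dataclass
class After:
    # Union annotations matching what the runtime actually accepts.
    fsdp: Optional[Union[List[str], str]] = field(default="")
    deepspeed: Optional[Union[str, Dict]] = None

# Both calls run, but only the second also satisfies a type checker.
Before(fsdp="full_shard auto_wrap", deepspeed="ds_config.json")
After(fsdp=["full_shard", "auto_wrap"], deepspeed={"zero_optimization": {"stage": 2}})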
@@ -406,7 +406,7 @@ class TrainingArguments:
             When resuming training, whether or not to skip the epochs and batches to get the data loading at the same
             stage as in the previous training. If set to `True`, the training will begin faster (as that skipping step
             can take a long time) but will not yield the same results as the interrupted training would have.
-        sharded_ddp (`bool`, `str` or list of [`~trainer_utils.ShardedDDPOption`], *optional*, defaults to `False`):
+        sharded_ddp (`bool`, `str` or list of [`~trainer_utils.ShardedDDPOption`], *optional*, defaults to `''`):
             Use Sharded DDP training from [FairScale](https://github.com/facebookresearch/fairscale) (in distributed
             training only). This is an experimental feature.
@@ -421,7 +421,7 @@ class TrainingArguments:
             If a string is passed, it will be split on space. If a bool is passed, it will be converted to an empty
             list for `False` and `["simple"]` for `True`.
-        fsdp (`bool`, `str` or list of [`~trainer_utils.FSDPOption`], *optional*, defaults to `False`):
+        fsdp (`bool`, `str` or list of [`~trainer_utils.FSDPOption`], *optional*, defaults to `''`):
             Use PyTorch Distributed Parallel Training (in distributed training only).
             A list of options along the following:
@@ -969,7 +969,7 @@ class TrainingArguments:
             )
         },
     )
-    sharded_ddp: str = field(
+    sharded_ddp: Optional[Union[List[ShardedDDPOption], str]] = field(
         default="",
         metadata={
             "help": (
@@ -980,7 +980,7 @@ class TrainingArguments:
             ),
         },
     )
-    fsdp: str = field(
+    fsdp: Optional[Union[List[FSDPOption], str]] = field(
         default="",
         metadata={
             "help": (
@@ -1005,8 +1005,8 @@ class TrainingArguments:
         default=None,
         metadata={
             "help": (
-                "Config to be used with FSDP (Pytorch Fully Sharded Data Parallel). The value is either a"
+                "Config to be used with FSDP (Pytorch Fully Sharded Data Parallel). The value is either a "
                 "fsdp json config file (e.g., `fsdp_config.json`) or an already loaded json file as `dict`."
             )
         },
     )
@@ -1019,11 +1019,11 @@ class TrainingArguments:
             )
         },
     )
-    deepspeed: Optional[str] = field(
+    deepspeed: Optional[Union[str, Dict]] = field(
         default=None,
         metadata={
             "help": (
-                "Enable deepspeed and pass the path to deepspeed json config file (e.g. ds_config.json) or an already"
+                "Enable deepspeed and pass the path to deepspeed json config file (e.g. `ds_config.json`) or an already"
                 " loaded json file as a dict"
             )
         },
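With the widened annotations in place, richer values now type-check as well. A minimal usage sketch, assuming a transformers version that includes this change, with accelerate installed (and deepspeed for the second example); the config values are illustrative only:

from transformers import TrainingArguments
from transformers.trainer_utils import FSDPOption

# `fsdp` now accepts a list of FSDPOption values in addition to a
# space-separated string such as "full_shard auto_wrap".
fsdp_args = TrainingArguments(
    output_dir="out",
    fsdp=[FSDPOption.FULL_SHARD, FSDPOption.AUTO_WRAP],
)

# `deepspeed` now accepts an already-loaded config dict in addition to
# a path to a json config file such as "ds_config.json".
ds_args = TrainingArguments(
    output_dir="out",
    deepspeed={"zero_optimization": {"stage": 2}},  # illustrative config
)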