"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "df5d9c3551a6405feb697a1cad903dddffa04bfe"
Unverified Commit e75cb0cb authored by Shauray Singh, committed by GitHub

fix type annotations for arguments in training_args (#24550)

* testing

* example script

* fix typehinting

* some tests

* make test

* optional update

* Union of arguments

* does this fix the issue

* remove reports

* set default to False

* documentation change

* None support

* does not need None

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)

Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"

This reverts commit c5e29d43.

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* merge

* hacky fix

* fixup

---------
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
parent 0c41765d
@@ -406,7 +406,7 @@ class TrainingArguments:
             When resuming training, whether or not to skip the epochs and batches to get the data loading at the same
             stage as in the previous training. If set to `True`, the training will begin faster (as that skipping step
             can take a long time) but will not yield the same results as the interrupted training would have.
-        sharded_ddp (`bool`, `str` or list of [`~trainer_utils.ShardedDDPOption`], *optional*, defaults to `False`):
+        sharded_ddp (`bool`, `str` or list of [`~trainer_utils.ShardedDDPOption`], *optional*, defaults to `''`):
             Use Sharded DDP training from [FairScale](https://github.com/facebookresearch/fairscale) (in distributed
             training only). This is an experimental feature.
@@ -421,7 +421,7 @@ class TrainingArguments:
             If a string is passed, it will be split on space. If a bool is passed, it will be converted to an empty
             list for `False` and `["simple"]` for `True`.
-        fsdp (`bool`, `str` or list of [`~trainer_utils.FSDPOption`], *optional*, defaults to `False`):
+        fsdp (`bool`, `str` or list of [`~trainer_utils.FSDPOption`], *optional*, defaults to `''`):
             Use PyTorch Distributed Parallel Training (in distributed training only).
             A list of options along the following:
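The sharded_ddp docstring in the hunk above describes how non-list values are normalized: a string is split on whitespace and a bool becomes an empty list for `False` or `["simple"]` for `True`. A minimal usage sketch under those documented rules; the fsdp option string `"full_shard auto_wrap"` is only an illustrative value and is not part of this diff:

```python
from transformers import TrainingArguments

# A space-separated string is split into a list of options,
# e.g. "full_shard auto_wrap" -> ["full_shard", "auto_wrap"] (illustrative values).
args = TrainingArguments(output_dir="out", fsdp="full_shard auto_wrap")

# A bool is normalized per the docstring: False -> [], True -> ["simple"].
args_sharded = TrainingArguments(output_dir="out", sharded_ddp=True)
```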
@@ -969,7 +969,7 @@ class TrainingArguments:
             )
         },
     )
-    sharded_ddp: str = field(
+    sharded_ddp: Optional[Union[List[ShardedDDPOption], str]] = field(
         default="",
         metadata={
             "help": (
@@ -980,7 +980,7 @@ class TrainingArguments:
             ),
         },
     )
-    fsdp: str = field(
+    fsdp: Optional[Union[List[FSDPOption], str]] = field(
         default="",
         metadata={
             "help": (
@@ -1005,8 +1005,8 @@ class TrainingArguments:
         default=None,
         metadata={
             "help": (
-                "Config to be used with FSDP (Pytorch Fully Sharded Data Parallel). The value is either a"
-                "fsdp json config file (e.g., `fsdp_config.json`) or an already loaded json file as `dict`."
+                "Config to be used with FSDP (Pytorch Fully Sharded Data Parallel). The value is either a"
+                "fsdp json config file (e.g., `fsdp_config.json`) or an already loaded json file as `dict`."
             )
         },
     )
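Per the `fsdp_config` help text above, the value may be either a path to a JSON file or an already loaded `dict`. A minimal sketch of both forms, assuming the file exists on disk; the option string and the config key used below are illustrative only and are not part of this diff:

```python
from transformers import TrainingArguments

# Path to a JSON file holding the FSDP settings...
args_from_file = TrainingArguments(
    output_dir="out",
    fsdp="full_shard",               # illustrative option string
    fsdp_config="fsdp_config.json",  # file must exist on disk
)

# ...or the same settings passed as an already loaded dict.
args_from_dict = TrainingArguments(
    output_dir="out",
    fsdp="full_shard",
    fsdp_config={"fsdp_min_num_params": 1_000_000},  # illustrative key only
)
```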
@@ -1019,11 +1019,11 @@ class TrainingArguments:
             )
         },
     )
-    deepspeed: Optional[str] = field(
+    deepspeed: Optional[Union[str, Dict]] = field(
         default=None,
         metadata={
             "help": (
-                "Enable deepspeed and pass the path to deepspeed json config file (e.g. ds_config.json) or an already"
+                "Enable deepspeed and pass the path to deepspeed json config file (e.g. `ds_config.json`) or an already"
                 " loaded json file as a dict"
             )
         },
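With the annotation widened to `Optional[Union[str, Dict]]`, the `deepspeed` argument matches what the help text already documents: a path to a DeepSpeed JSON config file or an in-memory dict. A minimal sketch of both forms, assuming DeepSpeed is installed and the config file exists; the config keys below are illustrative only and not part of this diff:

```python
from transformers import TrainingArguments

# Path to a DeepSpeed JSON config file on disk...
args_from_path = TrainingArguments(output_dir="out", deepspeed="ds_config.json")

# ...or the equivalent configuration passed as an already loaded dict.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",  # illustrative keys only
    "zero_optimization": {"stage": 2},
}
args_from_dict = TrainingArguments(output_dir="out", deepspeed=ds_config)
```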