Commit 54833886 authored by Will Frey, committed by GitHub

Update modeling_gpt_neox.py (#17575)

I'm guessing the intention was for the `_no_split_modules` class attribute of `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how it's set to `["GPTJBlock"]` for `GPTJPreTrainedModel`.

If this is incorrect, please feel free to just close the PR.

Thanks!
parent a1344dbf
@@ -53,6 +53,7 @@ class GPTNeoXPreTrainedModel(PreTrainedModel):
     config_class = GPTNeoXConfig
     base_model_prefix = "gpt_neox"
     supports_gradient_checkpointing = True
+    _no_split_modules = ["GPTNeoXLayer"]
 
     def _init_weights(self, module):
         """Initialize the weights"""
...
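For readers unfamiliar with the attribute, here is a minimal sketch of what a no-split list is meant to achieve when a model is spread across devices. It is a toy under stated assumptions, not the actual placement algorithm used by the library: `ToyModule`, `assign`, the sizes, and the capacities are all hypothetical names and numbers invented for illustration. The idea shown is that a module whose class is listed stays whole on one device, which is presumably why a whole transformer layer (the unit that typically contains residual connections) is the thing being listed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Hypothetical stand-in for a module: a name, a class name, a size, children."""
    name: str
    cls: str
    size: int = 0  # own parameter size in arbitrary units
    children: list = field(default_factory=list)

    def total(self) -> int:
        # Size of this module plus everything nested inside it.
        return self.size + sum(child.total() for child in self.children)


def assign(module, capacities, no_split, placement=None, state=None):
    """Toy greedy placement: fill device 0 first, then device 1, and so on.

    A module that does not fit in the remaining space is broken into its
    children, unless its class name is in `no_split`; in that case it is
    moved whole to the next device.
    """
    if placement is None:
        placement, state = {}, {"dev": 0, "used": 0}
    while True:
        free = capacities[state["dev"]] - state["used"]
        if module.total() <= free:
            placement[module.name] = state["dev"]
            state["used"] += module.total()
            return placement
        if module.children and module.cls not in no_split:
            for child in module.children:
                assign(child, capacities, no_split, placement, state)
            return placement
        if state["dev"] + 1 >= len(capacities):
            raise RuntimeError(f"{module.name} does not fit on any device")
        state["dev"] += 1
        state["used"] = 0


def build_model():
    # Two layers of total size 6 each; sizes are made up for the example.
    layers = [
        ToyModule(f"layer{i}", "GPTNeoXLayer", children=[
            ToyModule(f"layer{i}.attention", "Attention", size=2),
            ToyModule(f"layer{i}.mlp", "MLP", size=4),
        ])
        for i in range(2)
    ]
    return ToyModule("model", "GPTNeoXModel", children=layers)


capacities = [8, 8]  # two hypothetical devices
split = assign(build_model(), capacities, no_split=[])
kept = assign(build_model(), capacities, no_split=["GPTNeoXLayer"])
print(split)
print(kept)
```

With an empty list the second layer is divided between the two devices; with `["GPTNeoXLayer"]` in the list it is moved to the second device as a unit.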