Commit 54833886 (unverified), authored by Will Frey, committed by GitHub

Update modeling_gpt_neox.py (#17575)

I'm guessing the intention was for the `_no_split_modules` class attribute of `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how it's set to `["GPTJBlock"]` for `GPTJPreTrainedModel`.

If this is incorrect, please feel free to just close the PR.

Thanks!
parent a1344dbf
@@ -53,6 +53,7 @@ class GPTNeoXPreTrainedModel(PreTrainedModel):
     config_class = GPTNeoXConfig
     base_model_prefix = "gpt_neox"
     supports_gradient_checkpointing = True
+    _no_split_modules = ["GPTNeoXLayer"]
 
     def _init_weights(self, module):
         """Initialize the weights"""
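For context, here is a toy sketch of why a `_no_split_modules` entry matters when a model is spread across several devices. It is not the Transformers or Accelerate implementation: `plan_devices`, the layer sizes, and the device budgets are all hypothetical, and the greedy placement is only an assumption chosen to make the effect visible. The usual rationale, as I understand it, is that a block with a residual connection needs its input and output on the same device.

```python
# Toy illustration only: NOT the algorithm used by Transformers or Accelerate.
# plan_devices, the sizes, and the budgets below are hypothetical.

def plan_devices(modules, budgets, no_split):
    """Greedily assign modules to devices, in order.

    modules:  list of (name, class_name, size, children) tuples, where
              children is a list of (child_name, child_size) pairs.
    budgets:  per-device capacities, in the same units as the sizes.
    no_split: class names whose instances must stay on a single device.
    """
    placement = {}
    device = 0
    free = budgets[0]

    def place(name, size):
        nonlocal device, free
        # Advance to the next device until the unit fits.
        while size > free:
            device += 1
            if device >= len(budgets):
                raise RuntimeError(f"{name} does not fit on any remaining device")
            free = budgets[device]
        placement[name] = device
        free -= size

    for name, class_name, size, children in modules:
        if class_name in no_split or not children:
            # Atomic unit: the whole module goes to one device.
            place(name, size)
        else:
            # Splittable: each child may end up on a different device.
            for child_name, child_size in children:
                place(f"{name}.{child_name}", child_size)
    return placement


layers = [
    ("layers.0", "GPTNeoXLayer", 4, [("attention", 2), ("mlp", 2)]),
    ("layers.1", "GPTNeoXLayer", 4, [("attention", 2), ("mlp", 2)]),
]
budgets = [6, 6]

kept = plan_devices(layers, budgets, no_split=["GPTNeoXLayer"])
split = plan_devices(layers, budgets, no_split=[])

print(kept)   # each layer sits on exactly one device
print(split)  # layers.1 straddles devices 0 and 1
```

With the class listed as no-split, each layer stays whole on one device. Without the entry, the second layer's attention and MLP land on different devices in this toy setup.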