Unverified Commit 54833886 authored by Will Frey's avatar Will Frey Committed by GitHub
Browse files

Update modeling_gpt_neox.py (#17575)

I'm guessing that the intention was to have the `_no_split_modules` class attribute for `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how its set as `["GPTJBlock"]` for `GPTJPreTrainedModel`.

If this is incorrect, please feel free to just close the PR.

Thanks!
parent a1344dbf
......@@ -53,6 +53,7 @@ class GPTNeoXPreTrainedModel(PreTrainedModel):
config_class = GPTNeoXConfig
base_model_prefix = "gpt_neox"
supports_gradient_checkpointing = True
_no_split_modules = ["GPTNeoXLayer"]
def _init_weights(self, module):
"""Initialize the weights"""
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment