Commit a64f8c1f (unverified)

[docstring] fix incorrect llama docstring: encoder -> decoder (#27071)

fix incorrect docstring: encoder -> decoder

Authored Oct 26, 2023 by Jing Hua; committed by GitHub on Oct 25, 2023
Parent: 0baa9246
Showing 2 changed files with 3 additions and 3 deletions:

src/transformers/models/llama/configuration_llama.py  +2 -2
src/transformers/models/llama/modeling_llama.py       +1 -1
src/transformers/models/llama/configuration_llama.py

@@ -47,9 +47,9 @@ class LlamaConfig(PretrainedConfig):
         intermediate_size (`int`, *optional*, defaults to 11008):
             Dimension of the MLP representations.
         num_hidden_layers (`int`, *optional*, defaults to 32):
-            Number of hidden layers in the Transformer encoder.
+            Number of hidden layers in the Transformer decoder.
         num_attention_heads (`int`, *optional*, defaults to 32):
-            Number of attention heads for each attention layer in the Transformer encoder.
+            Number of attention heads for each attention layer in the Transformer decoder.
         num_key_value_heads (`int`, *optional*):
             This is the number of key_value heads that should be used to implement Grouped Query Attention. If
             `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if
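For readers skimming the docstring being fixed here, a minimal sketch (not part of this commit) of how `num_attention_heads` and `num_key_value_heads` select between Multi Head Attention and Grouped Query Attention in `LlamaConfig`. The small layer and head counts are arbitrary values chosen only so the snippet is cheap to run.

# Sketch, not part of the diff: the parameters documented above on LlamaConfig.
from transformers import LlamaConfig

# num_key_value_heads == num_attention_heads -> Multi Head Attention (MHA):
# every query head has its own key/value head.
mha_config = LlamaConfig(num_hidden_layers=2, num_attention_heads=8, num_key_value_heads=8)

# num_key_value_heads < num_attention_heads -> Grouped Query Attention (GQA):
# here 8 query heads share 2 key/value heads.
gqa_config = LlamaConfig(num_hidden_layers=2, num_attention_heads=8, num_key_value_heads=2)

print(mha_config.num_attention_heads, mha_config.num_key_value_heads)  # 8 8
print(gqa_config.num_attention_heads, gqa_config.num_key_value_heads)  # 8 2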
src/transformers/models/llama/modeling_llama.py

@@ -871,7 +871,7 @@ LLAMA_INPUTS_DOCSTRING = r"""
         past_key_values (`tuple(tuple(torch.FloatTensor))`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`):
             Tuple of `tuple(torch.FloatTensor)` of length `config.n_layers`, with each tuple having 2 tensors of shape
             `(batch_size, num_heads, sequence_length, embed_size_per_head)`) and 2 additional tensors of shape
-            `(batch_size, num_heads, encoder_sequence_length, embed_size_per_head)`.
+            `(batch_size, num_heads, decoder_sequence_length, embed_size_per_head)`.
             Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention
             blocks) that can be used (see `past_key_values` input) to speed up sequential decoding.
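As a usage note (again not part of the diff), a minimal sketch of how the `past_key_values` described by this docstring are produced with `use_cache=True` and reused for the next decoding step. The tiny randomly initialized model and its sizes are assumptions made only so the example runs without downloading a checkpoint; cache indexing follows the tuple-per-layer layout the docstring describes.

# Sketch, not part of the diff: producing and reusing past_key_values.
import torch
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=128, hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=4, num_key_value_heads=4,
)
model = LlamaForCausalLM(config).eval()  # random weights, illustration only

input_ids = torch.randint(0, config.vocab_size, (1, 5))  # (batch_size, sequence_length)
with torch.no_grad():
    out = model(input_ids, use_cache=True)

# One (key, value) pair per layer, each of shape
# (batch_size, num_heads, sequence_length, embed_size_per_head).
key, value = out.past_key_values[0]
print(key.shape)  # torch.Size([1, 4, 5, 16])

# Next step: feed only the newest token plus the cache instead of the full prefix,
# which is what "speed up sequential decoding" refers to above.
next_token = out.logits[:, -1:].argmax(-1)
with torch.no_grad():
    out = model(next_token, past_key_values=out.past_key_values, use_cache=True)
print(out.past_key_values[0][0].shape)  # torch.Size([1, 4, 6, 16])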