chenpangpang / transformers / Commits / 7e93ce40

Unverified commit 7e93ce40, authored Dec 21, 2023 by Joao Gante; committed by GitHub, Dec 21, 2023
Fix `input_embeds` docstring in encoder-decoder architectures (#28168)
Parent: 4f7806ef

Showing 20 changed files with 166 additions and 137 deletions (+166 -137)
src/transformers/models/bart/modeling_bart.py                              (+10 -9)
src/transformers/models/bart/modeling_tf_bart.py                           (+9 -5)
src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py        (+10 -9)
src/transformers/models/biogpt/modeling_biogpt.py                          (+5 -4)
src/transformers/models/blenderbot/modeling_blenderbot.py                  (+10 -9)
src/transformers/models/blenderbot/modeling_tf_blenderbot.py               (+5 -5)
src/transformers/models/blenderbot_small/modeling_blenderbot_small.py      (+10 -9)
src/transformers/models/blenderbot_small/modeling_tf_blenderbot_small.py   (+5 -5)
src/transformers/models/led/modeling_led.py                                (+5 -5)
src/transformers/models/m2m_100/modeling_m2m_100.py                        (+10 -9)
src/transformers/models/marian/modeling_tf_marian.py                       (+9 -5)
src/transformers/models/mbart/modeling_mbart.py                            (+10 -9)
src/transformers/models/mbart/modeling_tf_mbart.py                         (+9 -5)
src/transformers/models/musicgen/modeling_musicgen.py                      (+10 -8)
src/transformers/models/mvp/modeling_mvp.py                                (+10 -9)
src/transformers/models/nllb_moe/modeling_nllb_moe.py                      (+10 -9)
src/transformers/models/pegasus/modeling_pegasus.py                        (+10 -9)
src/transformers/models/pegasus/modeling_tf_pegasus.py                     (+9 -5)
src/transformers/models/pegasus_x/modeling_pegasus_x.py                    (+5 -4)
src/transformers/models/plbart/modeling_plbart.py                          (+5 -5)
src/transformers/models/bart/modeling_bart.py
@@ -1011,10 +1011,11 @@ BART_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
 Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
 representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
@@ -1331,11 +1332,11 @@ class BartDecoder(BartPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
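The `inputs_embeds` argument documented above bypasses the model's internal embedding lookup matrix. A minimal sketch of what that lookup does, with NumPy and toy sizes as a hypothetical stand-in for the real embedding layer:

```python
import numpy as np

# Toy stand-in for a model's embedding lookup matrix: row i holds the
# hidden-size vector for vocabulary id i. (Sizes are illustrative only.)
rng = np.random.default_rng(0)
vocab_size, hidden_size = 16, 4
embedding_matrix = rng.standard_normal((vocab_size, hidden_size))

# The internal lookup the docstring describes: input_ids -> inputs_embeds.
input_ids = np.array([[3, 1, 7]])            # (batch_size, sequence_length)
inputs_embeds = embedding_matrix[input_ids]  # (batch_size, sequence_length, hidden_size)

assert inputs_embeds.shape == (1, 3, hidden_size)
assert np.array_equal(inputs_embeds[0, 0], embedding_matrix[3])
```

Passing `inputs_embeds` directly replaces exactly this step, which is useful when you want to mix, perturb, or otherwise construct the vectors before they enter the encoder.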
src/transformers/models/bart/modeling_tf_bart.py
@@ -715,6 +715,10 @@ BART_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
 `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 use_cache (`bool`, *optional*, defaults to `True`):
 If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
 `past_key_values`). Set to `False` during training, `True` during generation
@@ -990,11 +994,11 @@ class TFBartDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py
@@ -1678,10 +1678,11 @@ BIGBIRD_PEGASUS_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
 Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
 representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
@@ -2136,11 +2137,11 @@ class BigBirdPegasusDecoder(BigBirdPegasusPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
src/transformers/models/biogpt/modeling_biogpt.py
@@ -396,10 +396,11 @@ BIOGPT_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 use_cache (`bool`, *optional*):
 If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
 `past_key_values`).
src/transformers/models/blenderbot/modeling_blenderbot.py
@@ -589,10 +589,11 @@ BLENDERBOT_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
 Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
 representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
@@ -892,11 +893,11 @@ class BlenderbotDecoder(BlenderbotPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
src/transformers/models/blenderbot/modeling_tf_blenderbot.py
@@ -949,11 +949,11 @@ class TFBlenderbotDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
src/transformers/models/blenderbot_small/modeling_blenderbot_small.py
@@ -589,10 +589,11 @@ BLENDERBOT_SMALL_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
 Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
 representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
@@ -889,11 +890,11 @@ class BlenderbotSmallDecoder(BlenderbotSmallPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
src/transformers/models/blenderbot_small/modeling_tf_blenderbot_small.py
@@ -957,11 +957,11 @@ class TFBlenderbotSmallDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
src/transformers/models/led/modeling_led.py
@@ -2021,11 +2021,11 @@ class LEDDecoder(LEDPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
src/transformers/models/m2m_100/modeling_m2m_100.py
View file @
7e93ce40
...
@@ -630,10 +630,11 @@ M2M_100_INPUTS_DOCSTRING = r"""
...
@@ -630,10 +630,11 @@ M2M_100_INPUTS_DOCSTRING = r"""
If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
`decoder_input_ids` of shape `(batch_size, sequence_length)`.
`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
can choose to directly pass an embedded representation. This is useful if you want more control over how to
Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
This is useful if you want more control over how to convert `input_ids` indices into associated vectors
than the model's internal embedding lookup matrix.
decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -931,11 +932,11 @@ class M2M100Decoder(M2M100PreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail.
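The `past_key_values` contract these docstrings describe — once a cache exists, feed only the last `decoder_input_ids`, of shape `(batch_size, 1)`, instead of the whole prefix — can be sketched with a toy cache (`decoder_step` is a hypothetical stand-in, not a transformers function):

```python
def decoder_step(token_ids, past_key_values=None):
    # Real decoders cache per-layer key/value tensors; here the "cache" is
    # simply the list of token ids processed so far.
    return (past_key_values or []) + token_ids

cache = decoder_step([7, 2, 9])    # first call: the whole prefix
cache = decoder_step([4], cache)   # later calls: only the newest token
assert cache == [7, 2, 9, 4]
```

Because earlier positions are already represented in the cache, each generation step does work proportional to one token rather than to the full sequence.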
...
src/transformers/models/marian/modeling_tf_marian.py
...
@@ -688,6 +688,10 @@ MARIAN_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
 `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 use_cache (`bool`, *optional*, defaults to `True`):
     If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
     `past_key_values`). Set to `False` during training, `True` during generation
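The `use_cache` flag in the context above only controls whether those key/value states are returned at all; when it is off (as during training), nothing is cached and every step must reprocess the full sequence. A toy sketch of that contract (illustrative names, not the real forward signature):

```python
def forward(tokens, use_cache=False):
    hidden_states = [t * 2 for t in tokens]  # stand-in computation
    # With use_cache=False (e.g. training), no states are returned, so a
    # caller has nothing to pass back in on the next step.
    past_key_values = tuple(hidden_states) if use_cache else None
    return hidden_states, past_key_values

_, cache = forward([1, 2, 3], use_cache=True)
assert cache == (2, 4, 6)
assert forward([1, 2, 3])[1] is None
```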
...
@@ -975,11 +979,11 @@ class TFMarianDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
...
src/transformers/models/mbart/modeling_mbart.py
...
@@ -876,10 +876,11 @@ MBART_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
     Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
     representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -1191,11 +1192,11 @@ class MBartDecoder(MBartPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail.
...
src/transformers/models/mbart/modeling_tf_mbart.py
...
@@ -636,6 +636,10 @@ MBART_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
 `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 use_cache (`bool`, *optional*, defaults to `True`):
     If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
     `past_key_values`). Set to `False` during training, `True` during generation
...
@@ -981,11 +985,11 @@ class TFMBartDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
...
src/transformers/models/musicgen/modeling_musicgen.py
...
@@ -539,10 +539,11 @@ MUSICGEN_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
     Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
     representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -626,10 +627,11 @@ MUSICGEN_DECODER_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned
     tensors for more detail.
...
src/transformers/models/mvp/modeling_mvp.py
...
@@ -629,10 +629,11 @@ MVP_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
     Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
     representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -1067,11 +1068,11 @@ class MvpDecoder(MvpPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail.
...
src/transformers/models/nllb_moe/modeling_nllb_moe.py
...
@@ -943,10 +943,11 @@ NLLB_MOE_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
     Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
     representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -1271,11 +1272,11 @@ class NllbMoeDecoder(NllbMoePreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail.
...
src/transformers/models/pegasus/modeling_pegasus.py
...
@@ -584,10 +584,11 @@ PEGASUS_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
     Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
     representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
...
@@ -946,11 +947,11 @@ class PegasusDecoder(PegasusPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail.
...
src/transformers/models/pegasus/modeling_tf_pegasus.py
...
@@ -688,6 +688,10 @@ PEGASUS_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
 `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 use_cache (`bool`, *optional*, defaults to `True`):
     If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
     `past_key_values`). Set to `False` during training, `True` during generation output_attentions (`bool`,
...
@@ -985,11 +989,11 @@ class TFPegasusDecoder(tf.keras.layers.Layer):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`tf.Tensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids`
-you can choose to directly pass an embedded representation. This is useful if you want more control
-over how to convert `input_ids` indices into associated vectors than the model's internal embedding
-lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`tf.Tensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
     Whether or not to return the attentions tensors of all attention layers. See `attentions` under
     returned tensors for more detail. This argument can be used only in eager mode, in graph mode the value
...
src/transformers/models/pegasus_x/modeling_pegasus_x.py
...
@@ -840,10 +840,11 @@ PEGASUS_X_INPUTS_DOCSTRING = r"""
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
 don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
-`decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of shape
-`(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing `input_ids` you
-can choose to directly pass an embedded representation. This is useful if you want more control over how to
-convert `input_ids` indices into associated vectors than the model's internal embedding lookup matrix.
+`decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 decoder_inputs_embeds (`torch.FloatTensor` of shape `(batch_size, target_sequence_length, hidden_size)`, *optional*):
 Optionally, instead of passing `decoder_input_ids` you can choose to directly pass an embedded
 representation. If `past_key_values` is used, optionally only the last `decoder_inputs_embeds` have to be
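The behavior this docstring describes, with `inputs_embeds` standing in for the model's internal embedding lookup, can be sketched with a toy embedding matrix. This is a NumPy stand-in with made-up sizes, not the real PEGAS­US-X code path:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_size = 32, 8

# Stand-in for the model's "internal embedding lookup matrix".
embedding_matrix = rng.standard_normal((vocab_size, hidden_size))

input_ids = np.array([[4, 7, 2]])        # (batch_size=1, sequence_length=3)

# Path A: what the model does internally when given `input_ids`.
looked_up = embedding_matrix[input_ids]  # (1, 3, hidden_size)

# Path B: precompute the embeddings yourself and pass them as `inputs_embeds`,
# e.g. after custom scaling or mixing. Here we apply no customization,
# so the two paths produce identical tensors.
inputs_embeds = embedding_matrix[input_ids]

assert np.array_equal(looked_up, inputs_embeds)
assert inputs_embeds.shape == (1, 3, hidden_size)
```

Passing `inputs_embeds` is mainly useful when you want to modify the vectors before they enter the encoder, which a plain id lookup cannot express.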
...
src/transformers/models/plbart/modeling_plbart.py
View file @ 7e93ce40
...
@@ -938,11 +938,11 @@ class PLBartDecoder(PLBartPreTrainedModel):
 If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those
 that don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of
-all `decoder_input_ids` of shape `(batch_size, sequence_length)`. inputs_embeds (`torch.FloatTensor` of
-shape `(batch_size, sequence_length, hidden_size)`, *optional*): Optionally, instead of passing
-`input_ids` you can choose to directly pass an embedded representation. This is useful if you want more
-control over how to convert `input_ids` indices into associated vectors than the model's internal
-embedding lookup matrix.
+all `decoder_input_ids` of shape `(batch_size, sequence_length)`.
+inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
+    Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation.
+    This is useful if you want more control over how to convert `input_ids` indices into associated vectors
+    than the model's internal embedding lookup matrix.
 output_attentions (`bool`, *optional*):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more detail.
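The `past_key_values` shortcut mentioned in this hunk, passing only the last `decoder_input_ids` of shape `(batch_size, 1)` once earlier states are cached, has a direct shape consequence for the embedded inputs. A toy NumPy sketch (made-up sizes, not the actual PLBart decoder):

```python
import numpy as np

rng = np.random.default_rng(1)
batch_size, hidden_size = 2, 8
embedding_matrix = rng.standard_normal((16, hidden_size))

# First decoding step: no cache yet, so the full prefix is embedded.
decoder_input_ids = np.array([[3, 5, 9], [1, 4, 2]])  # (batch_size, seq_len=3)
prefix_embeds = embedding_matrix[decoder_input_ids]   # (2, 3, hidden_size)
assert prefix_embeds.shape == (batch_size, 3, hidden_size)

# Later steps: with `past_key_values` cached, only the newest token is passed,
# so the embedded decoder input collapses to (batch_size, 1, hidden_size).
last_ids = np.array([[7], [6]])                       # (batch_size, 1)
last_embeds = embedding_matrix[last_ids]
assert last_embeds.shape == (batch_size, 1, hidden_size)
```

The cache makes re-embedding and re-attending over the already-processed prefix unnecessary, which is why only the single newest position needs to be supplied.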
...