chenpangpang / transformers / Commits / 4157e3cd

Unverified Commit 4157e3cd, authored Sep 13, 2022 by Joao Gante, committed by GitHub on Sep 13, 2022
parent f89f16a5

new length penalty docstring (#19006)

Showing 7 changed files with 40 additions and 28 deletions (+40, -28)
src/transformers/configuration_utils.py              +4  -1
src/transformers/generation_beam_search.py            +8  -6
src/transformers/generation_tf_utils.py               +12 -9
src/transformers/generation_utils.py                  +4  -3
src/transformers/models/fsmt/configuration_fsmt.py    +4  -1
src/transformers/models/rag/modeling_rag.py           +4  -4
src/transformers/models/rag/modeling_tf_rag.py        +4  -4
src/transformers/configuration_utils.py
...
@@ -148,7 +148,10 @@ class PretrainedConfig(PushToHubMixin):
     Parameter for repetition penalty that will be used by default in the `generate` method of the model. 1.0
     means no penalty.
 length_penalty (`float`, *optional*, defaults to 1):
-    Exponential penalty to the length that will be used by default in the `generate` method of the model.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+    the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+    likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+    `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to 0) -- Value that will be used by default in the
     `generate` method of the model for `no_repeat_ngram_size`. If set to int > 0, all ngrams of that size can
     only occur once.
...
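All seven files receive the same rewritten description, which boils down to one scoring rule: the hypothesis score (a summed log likelihood) is divided by the sequence length raised to `length_penalty`. A minimal sketch of that rule in Python, with illustrative names rather than the library's internal API:

    def penalized_beam_score(sum_logprobs: float, seq_length: int, length_penalty: float = 1.0) -> float:
        # `sum_logprobs` is the cumulative log likelihood of the hypothesis, so it is
        # negative; dividing by a larger `seq_length ** length_penalty` moves the score
        # toward zero (i.e. upward) when `length_penalty` > 0.0, favouring longer outputs.
        return sum_logprobs / (seq_length ** length_penalty)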
src/transformers/generation_beam_search.py
...
@@ -138,9 +138,10 @@ class BeamSearchScorer(BeamScorer):
     Defines the device type (*e.g.*, `"cpu"` or `"cuda"`) on which this instance of `BeamSearchScorer` will be
     allocated.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the
-    model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer
-    sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+    the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+    likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+    `length_penalty` < 0.0 encourages shorter sequences.
 do_early_stopping (`bool`, *optional*, defaults to `False`):
     Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
 num_beam_hyps_to_keep (`int`, *optional*, defaults to 1):
...
@@ -405,9 +406,10 @@ class ConstrainedBeamSearchScorer(BeamScorer):
     Defines the device type (*e.g.*, `"cpu"` or `"cuda"`) on which this instance of `BeamSearchScorer` will be
     allocated.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the
-    model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer
-    sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+    the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+    likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+    `length_penalty` < 0.0 encourages shorter sequences.
 do_early_stopping (`bool`, *optional*, defaults to `False`):
     Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
 num_beam_hyps_to_keep (`int`, *optional*, defaults to 1):
...
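To see why the sign works the way the new docstring states, here is a small worked comparison under the rule sketched above (the numbers are made up):

    short_hyp = {"sum_logprobs": -4.0, "length": 5}
    long_hyp = {"sum_logprobs": -7.0, "length": 10}

    for length_penalty in (1.0, 0.0, -1.0):
        scores = {
            name: hyp["sum_logprobs"] / (hyp["length"] ** length_penalty)
            for name, hyp in (("short", short_hyp), ("long", long_hyp))
        }
        print(length_penalty, scores)

    # length_penalty =  1.0 -> short: -0.8,  long: -0.7    (longer hypothesis ranks higher)
    # length_penalty =  0.0 -> short: -4.0,  long: -7.0    (raw log likelihood, shorter wins)
    # length_penalty = -1.0 -> short: -20.0, long: -70.0   (shorter hypothesis wins by a wide margin)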
src/transformers/generation_tf_utils.py
...
@@ -455,10 +455,10 @@ class TFGenerationMixin:
 eos_token_id (`int`, *optional*):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty.
-    Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
-    order to encourage the model to produce longer sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to 0):
     If set to int > 0, all ngrams of that size can only occur once.
 bad_words_ids(`List[int]`, *optional*):
...
@@ -1419,10 +1419,10 @@ class TFGenerationMixin:
 eos_token_id (`int`, *optional*):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty.
-    Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
-    order to encourage the model to produce longer sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to 0):
     If set to int > 0, all ngrams of that size can only occur once.
 bad_words_ids(`List[int]`, *optional*):
...
@@ -2657,7 +2657,10 @@ class TFGenerationMixin:
 eos_token_id (`int`, *optional*):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 early_stopping (`bool`, *optional*, defaults to `False`):
     Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not.
 logits_processor (`[TFLogitsProcessorList]`, *optional*):
...
src/transformers/generation_utils.py
...
@@ -1005,9 +1005,10 @@ class GenerationMixin:
 eos_token_id (`int`, *optional*, defaults to `model.config.eos_token_id`):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to `model.config.length_penalty` or 1.0 if the config does not set any value):
-    Exponential penalty to the length. 1.0 means that the beam score is penalized by the sequence length.
-    0.0 means no penalty. Set to values < 0.0 in order to encourage the model to generate longer
-    sequences, to a value > 0.0 in order to encourage the model to produce shorter sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to `model.config.no_repeat_ngram_size` or 0 if the config does not set any value):
     If set to int > 0, all ngrams of that size can only occur once.
 encoder_no_repeat_ngram_size (`int`, *optional*, defaults to `model.config.encoder_no_repeat_ngram_size` or 0 if the config does not set any value):
...
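For context, `length_penalty` only has an effect when `generate` runs a beam-based search (`num_beams` > 1); greedy decoding and sampling ignore it. A hedged usage sketch follows; the checkpoint and input text are just examples, not taken from this commit:

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

    text = "The tower is 324 metres tall, about the same height as an 81-storey building."
    inputs = tokenizer(text, return_tensors="pt")

    # length_penalty > 0.0 nudges beam search toward longer summaries,
    # length_penalty < 0.0 toward shorter ones; 1.0 is the default.
    summary_ids = model.generate(**inputs, num_beams=4, length_penalty=2.0, max_length=60)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))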
src/transformers/models/fsmt/configuration_fsmt.py
...
@@ -107,7 +107,10 @@ class FSMTConfig(PretrainedConfig):
     Number of beams for beam search that will be used by default in the `generate` method of the model. 1 means
     no beam search.
 length_penalty (`float`, *optional*, defaults to 1)
-    Exponential penalty to the length that will be used by default in the `generate` method of the model.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent to
+    the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log
+    likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences, while
+    `length_penalty` < 0.0 encourages shorter sequences.
 early_stopping (`bool`, *optional*, defaults to `False`)
     Flag that will be used by default in the `generate` method of the model. Whether to stop the beam search
     when at least `num_beams` sentences are finished per batch or not.
...
src/transformers/models/rag/modeling_rag.py
...
@@ -1463,10 +1463,10 @@ class RagTokenForGeneration(RagPreTrainedModel):
 eos_token_id (`int`, *optional*):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty.
-    Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
-    order to encourage the model to produce longer sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to 0):
     If set to int > 0, all ngrams of that size can only occur once.
 encoder_no_repeat_ngram_size (`int`, *optional*, defaults to 0):
...
src/transformers/models/rag/modeling_tf_rag.py
...
@@ -1054,10 +1054,10 @@ class TFRagTokenForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingLoss
 eos_token_id (`int`, *optional*):
     The id of the *end-of-sequence* token.
 length_penalty (`float`, *optional*, defaults to 1.0):
-    Exponential penalty to the length. 1.0 means no penalty.
-    Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in
-    order to encourage the model to produce longer sequences.
+    Exponential penalty to the length that is used with beam-based generation. It is applied as an exponent
+    to the sequence length, which in turn is used to divide the score of the sequence. Since the score is
+    the log likelihood of the sequence (i.e. negative), `length_penalty` > 0.0 promotes longer sequences,
+    while `length_penalty` < 0.0 encourages shorter sequences.
 no_repeat_ngram_size (`int`, *optional*, defaults to 0):
     If set to int > 0, all ngrams of that size can only occur once.
 bad_words_ids(`List[int]`, *optional*):
...