Clarify and add missing typical_p argument docstring. (#21095)

* Clarify and add missing typical_p docstring.
* Make the docstring easier to understand.
* Clarify typical_p docstring
  Accept the suggestion by @stevhliu for paraphrasing the docstring.
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Use the same docstring as in GenerationConfig
  Follow the suggestion suggested by @stevhliu in the pull request conversation.
* Fix docstring spacing.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Commit 8896ebb9, authored Jan 17, 2023 by Sherman Siu; committed by GitHub Jan 17, 2023
Showing with 11 additions and 2 deletions

src/transformers/configuration_utils.py +6 -0

src/transformers/generation/configuration_utils.py +5 -2

--- a/src/transformers/configuration_utils.py
+++ b/src/transformers/configuration_utils.py
@@ -144,6 +144,12 @@ class PretrainedConfig(PushToHubMixin):
        top_p (`float`, *optional*, defaults to 1):
            Value that will be used by default in the `generate` method of the model for `top_p`. If set to float < 1,
            only the most probable tokens with probabilities that add up to `top_p` or higher are kept for generation.
+        typical_p (`float`, *optional*, defaults to 1):
+            Local typicality measures how similar the conditional probability of predicting a target token next is to
+            the expected conditional probability of predicting a random token next, given the partial text already
+            generated. If set to float < 1, the smallest set of the most locally typical tokens with probabilities that
+            add up to `typical_p` or higher are kept for generation. See [this
+            paper](https://arxiv.org/pdf/2202.00666.pdf) for more details.
        repetition_penalty (`float`, *optional*, defaults to 1):
            Parameter for repetition penalty that will be used by default in the `generate` method of the model. 1.0
            means no penalty.

--- a/src/transformers/generation/configuration_utils.py
+++ b/src/transformers/generation/configuration_utils.py
@@ -111,8 +111,11 @@ class GenerationConfig(PushToHubMixin):
            If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to
            `top_p` or higher are kept for generation.
        typical_p (`float`, *optional*, defaults to 1.0):
-            The amount of probability mass from the original distribution to be considered in typical decoding. If set
-            to 1.0 it takes no effect. See [this paper](https://arxiv.org/pdf/2202.00666.pdf) for more details.
+            Local typicality measures how similar the conditional probability of predicting a target token next is to
+            the expected conditional probability of predicting a random token next, given the partial text already
+            generated. If set to float < 1, the smallest set of the most locally typical tokens with probabilities that
+            add up to `typical_p` or higher are kept for generation. See [this
+            paper](https://arxiv.org/pdf/2202.00666.pdf) for more details.
        diversity_penalty (`float`, *optional*, defaults to 0.0):
            This value is subtracted from a beam's score if it generates a token same as any beam from other group at a
            particular time. Note that `diversity_penalty` is only effective if `group beam search` is enabled.
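
The `typical_p` docstring describes a filtering rule that may be easier to follow as code. Below is a minimal pure-Python sketch of that rule, not the library's implementation: `typical_filter` is a hypothetical helper name, and it assumes the input is already a normalized probability list rather than the logits that `generate` works with. It ranks tokens by how close their surprisal (negative log-probability) is to the entropy of the distribution, then keeps the smallest prefix of that ranking whose probabilities add up to `typical_p` or higher.

```python
import math


def typical_filter(probs, typical_p):
    """Return the indices of tokens kept by locally typical filtering.

    Hypothetical helper for illustration only; `probs` is assumed to be a
    normalized next-token distribution.
    """
    # Entropy is the expected surprisal of the next token.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)

    def deviation(i):
        # How far this token's surprisal is from the expected surprisal.
        p = probs[i]
        return abs(-math.log(p) - entropy) if p > 0 else math.inf

    # Most locally typical tokens (smallest deviation) come first.
    ranked = sorted(range(len(probs)), key=deviation)

    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= typical_p:
            break
    return kept


probs = [0.5, 0.25, 0.125, 0.125]
print(typical_filter(probs, 0.7))  # [1, 0]
print(typical_filter(probs, 0.2))  # [1]
```

In the second call the most probable token (index 0) is dropped, because its surprisal is further from the entropy than that of index 1. This is where the rule differs from `top_p`, which always keeps the most probable tokens first.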