Unverified Commit 41c5f45b authored by Abhipsha Das's avatar Abhipsha Das Committed by GitHub

[DOCS] Add example for `TopPLogitsWarper` (#25361)

* [DOCS] Add example for `TopPLogitsWarper`

* fix typo

* address review feedback

* address review nits
parent 3a05e010
@@ -370,6 +370,37 @@ class TopPLogitsWarper(LogitsWarper):
All filtered values will be set to this float value.
min_tokens_to_keep (`int`, *optional*, defaults to 1):
    Minimum number of tokens that cannot be filtered.
Examples:
```python
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> set_seed(0)
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> text = "It is probably one of the most important things for parents to teach children about patience and acceptance. In this way, we as a society can ensure"
>>> inputs = tokenizer(text, return_tensors="pt")
>>> # Generate sequences without top_p sampling
>>> # We see that the answer tends to have a lot of repeated tokens and phrases
>>> outputs = model.generate(**inputs, max_length=55)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
'It is probably one of the most important things for parents to teach children about patience and acceptance. In this way, we as a society can ensure that our children are not taught to be impatient or to be afraid of the future.\n\nThe first step is to teach them'
>>> # Generate sequences with top_p sampling: set `do_sample=True` and pass the `top_p` argument to use top_p sampling
>>> # We see that the answer has fewer repeated tokens and is more diverse
>>> outputs = model.generate(**inputs, max_length=55, do_sample=True, top_p=0.25)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
'It is probably one of the most important things for parents to teach children about patience and acceptance. In this way, we as a society can ensure that children learn to be more accepting of others and to be more tolerant of others.\n\nWe can also teach children to be'
>>> # Generate sequences with top_p sampling with a larger top_p value
>>> # We see that as we increase the top_p value, less probable tokens also get selected during text generation, making the answer more diverse
>>> # Pro Tip: In practice, we tend to use top_p values between 0.9 and 1.0!
>>> outputs = model.generate(**inputs, max_length=55, do_sample=True, top_p=0.95)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
'It is probably one of the most important things for parents to teach children about patience and acceptance. In this way, we as a society can ensure we have the best learning environment, so that we can teach to learn and not just take advantage of the environment.\n\nThe'
```
""" """
def __init__(self, top_p: float, filter_value: float = -float("Inf"), min_tokens_to_keep: int = 1):
...
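For context on what the warper does under the hood: top-p (nucleus) filtering keeps the smallest set of highest-probability tokens whose cumulative probability exceeds `top_p`, and assigns `filter_value` to everything else. Below is a minimal standalone sketch of that filtering step; the function name `top_p_filter` and the toy logits are illustrative only, not the exact code in `TopPLogitsWarper.__call__`.

```python
import torch
import torch.nn.functional as F


def top_p_filter(
    logits: torch.Tensor,
    top_p: float,
    filter_value: float = -float("Inf"),
    min_tokens_to_keep: int = 1,
) -> torch.Tensor:
    """Keep the smallest set of top tokens whose cumulative probability exceeds `top_p`."""
    # Sort logits from most to least likely and accumulate their probabilities
    sorted_logits, sorted_indices = torch.sort(logits, descending=True)
    cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

    # Mark tokens for removal once the cumulative probability passes top_p,
    # shifting the mask by one so the token that crosses the threshold is kept
    sorted_indices_to_remove = cumulative_probs > top_p
    sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
    sorted_indices_to_remove[..., 0] = False

    # Never filter down below min_tokens_to_keep tokens
    sorted_indices_to_remove[..., :min_tokens_to_keep] = False

    # Map the mask back to the original (unsorted) vocabulary order and apply it
    indices_to_remove = sorted_indices_to_remove.scatter(-1, sorted_indices, sorted_indices_to_remove)
    return logits.masked_fill(indices_to_remove, filter_value)


# Toy example: probs are roughly [0.57, 0.21, 0.13, 0.09], so with top_p=0.5
# only the most likely token survives; the rest are set to -inf
logits = torch.tensor([[2.0, 1.0, 0.5, 0.1]])
print(top_p_filter(logits, top_p=0.5))
```

After filtering, sampling from `softmax` of the returned logits only ever draws from the surviving nucleus, which is why higher `top_p` values in the doctest above produce more diverse continuations.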