Generate: All logits processors are documented and have examples (#27796)

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Commit 58e7f9bb, authored Dec 07, 2023 by Joao Gante; committed by GitHub Dec 07, 2023
4 changed files
--- a/docs/source/en/llm_tutorial.md
+++ b/docs/source/en/llm_tutorial.md
@@ -250,7 +250,7 @@ While the autoregressive generation process is relatively straightforward, makin
 1. [Guide](generation_strategies) on how to control different generation methods, how to set up the generation configuration file, and how to stream the output;
 2. [Guide](chat_templating) on the prompt template for chat LLMs;
 3. [Guide](tasks/prompting) on to get the most of prompt design;
-4. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils).
+4. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!

 ### LLM leaderboards


--- a/src/transformers/generation/configuration_utils.py
+++ b/src/transformers/generation/configuration_utils.py
@@ -63,6 +63,14 @@ class GenerationConfig(PushToHubMixin):
    You do not need to call any of the above methods directly. Pass custom parameter values to '.generate()'. To learn
    more about decoding strategies refer to the [text generation strategies guide](../generation_strategies).

+    <Tip>
+
+    A large number of these flags control the logits or the stopping criteria of the generation. Make sure you check
+    the [generate-related classes](https://huggingface.co/docs/transformers/internal/generation_utils) for a full
+    description of the possible manipulations, as well as examples of their usage.
+
+    </Tip>
+
    Arg:
        > Parameters that control the length of the output


--- a/src/transformers/generation/logits_process.py
+++ b/src/transformers/generation/logits_process.py
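
Note: one of the processors defined in this file, `EncoderNoRepeatNGramLogitsProcessor`, prevents the output from repeating n-grams that occur in the encoder input. The idea can be sketched in plain Python as below. This is an illustrative approximation under stated assumptions, not the library's implementation: the helper names `banned_next_tokens` and `mask_banned` are hypothetical, and token ids are plain lists of ints rather than tensors.

```python
import math


def banned_next_tokens(encoder_ids, decoder_ids, ngram_size):
    """Tokens that would complete an n-gram already present in `encoder_ids`.

    Hypothetical helper for illustration only.
    """
    # Not enough generated tokens yet to form the first n-1 items of an n-gram.
    if ngram_size <= 0 or len(decoder_ids) + 1 < ngram_size:
        return set()
    # The last n-1 generated tokens (empty when ngram_size == 1).
    prefix = tuple(decoder_ids[len(decoder_ids) - (ngram_size - 1):])
    banned = set()
    for i in range(len(encoder_ids) - ngram_size + 1):
        ngram = tuple(encoder_ids[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


def mask_banned(logits, banned):
    """Return a copy of `logits` with banned token ids set to -inf."""
    return [-math.inf if i in banned else x for i, x in enumerate(logits)]


encoder_ids = [5, 6, 7, 5, 6, 8]
decoder_ids = [1, 5, 6]
banned = banned_next_tokens(encoder_ids, decoder_ids, ngram_size=3)
masked = mask_banned([0.0] * 10, banned)
```

With these toy inputs, the generated suffix `5, 6` matches the start of two encoder 3-grams, so tokens 7 and 8 are masked for the next step.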
--- a/src/transformers/generation/utils.py
+++ b/src/transformers/generation/utils.py
@@ -1031,15 +1031,8 @@ class GenerationMixin:
            generation_config.encoder_no_repeat_ngram_size is not None
            and generation_config.encoder_no_repeat_ngram_size > 0
        ):
-            if self.config.is_encoder_decoder:
            processors.append(
-                    EncoderNoRepeatNGramLogitsProcessor(
-                        generation_config.encoder_no_repeat_ngram_size, encoder_input_ids
-                    )
-                )
-            else:
-                raise ValueError(
-                    "It's impossible to use `encoder_no_repeat_ngram_size` with decoder-only architecture"
+                EncoderNoRepeatNGramLogitsProcessor(generation_config.encoder_no_repeat_ngram_size, encoder_input_ids)
            )
        if generation_config.bad_words_ids is not None:
            processors.append(