Remove ConversationalPipeline and Conversation object (#31165)

* Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal
* Update not-doctested.txt
* Fix JA and ZH docs
* Fix JA and ZH docs some more
* Fix JA and ZH docs some more
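
Migration note (illustrative, not part of the commit): code that still holds data in the legacy `Conversation` fields can flatten it into the plain list of `role`/`content` dicts that the rest of this change standardizes on. The sketch below mirrors the interleaving logic of the removed `Conversation.__init__`. The helper name `legacy_to_messages` and its `new_user_input` parameter are assumptions made for this example and do not exist in the library.

```python
from typing import Dict, List, Optional


def legacy_to_messages(
    past_user_inputs: Optional[List[str]] = None,
    generated_responses: Optional[List[str]] = None,
    new_user_input: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Hypothetical helper: interleave legacy fields into role/content dicts."""
    past_user_inputs = past_user_inputs or []
    generated_responses = generated_responses or []
    messages: List[Dict[str, str]] = []
    # The two lists may differ in length by one, so index instead of zip().
    for i in range(max(len(past_user_inputs), len(generated_responses))):
        if i < len(past_user_inputs):
            messages.append({"role": "user", "content": past_user_inputs[i]})
        if i < len(generated_responses):
            messages.append({"role": "assistant", "content": generated_responses[i]})
    # Assumed convention for this sketch: a pending user turn goes last.
    if new_user_input is not None:
        messages.append({"role": "user", "content": new_user_input})
    return messages


chat = legacy_to_messages(
    past_user_inputs=["Going to the movies tonight - any suggestions?"],
    generated_responses=["The Big Lebowski."],
    new_user_input="Is it good?",
)
```

According to the deprecation warning in the removed class, the `text-generation` pipeline accepts lists of message dicts, so a list shaped like `chat` should be usable there directly.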

Unverified Commit 065729a6 authored Jun 07, 2024 by Matt Committed by GitHub Jun 07, 2024
20 changed files
--- a/docs/source/en/main_classes/pipelines.md
+++ b/docs/source/en/main_classes/pipelines.md
@@ -386,14 +386,6 @@ Pipelines available for computer vision tasks include the following.

 Pipelines available for natural language processing tasks include the following.

-### ConversationalPipeline
-
-[[autodoc]] Conversation
-
-[[autodoc]] ConversationalPipeline
-    - __call__
-    - all
-
 ### FillMaskPipeline

 [[autodoc]] FillMaskPipeline

--- a/docs/source/ja/chat_templating.md
+++ b/docs/source/ja/chat_templating.md
@@ -180,8 +180,8 @@ tokenizer.chat_template = template  # Set the new template
 tokenizer.push_to_hub("model_name")  # Upload your new template to the Hub!
 ```

-[`~PreTrainedTokenizer.apply_chat_template`] メソッドは、あなたのチャットテンプレートを使用するために [`ConversationalPipeline`] クラスによって呼び出されます。
-したがって、正しいチャットテンプレートを設定すると、あなたのモデルは自動的に [`ConversationalPipeline`] と互換性があるようになります。
+[`~PreTrainedTokenizer.apply_chat_template`] メソッドは、あなたのチャットテンプレートを使用するために `TextGenerationPipeline` クラスによって呼び出されます。
+したがって、正しいチャットテンプレートを設定すると、あなたのモデルは自動的に [`TextGenerationPipeline`] と互換性があるようになります。


 ## What are "default" templates?
@@ -189,7 +189,7 @@ tokenizer.push_to_hub("model_name")  # Upload your new template to the Hub!
 チャットテンプレートの導入前に、チャットの処理はモデルクラスレベルでハードコードされていました。
 後方互換性のために、このクラス固有の処理をデフォルトテンプレートとして保持し、クラスレベルで設定されています。
 モデルにチャットテンプレートが設定されていない場合、ただしモデルクラスのデフォルトテンプレートがある場合、
-`ConversationalPipeline`クラスや`apply_chat_template`などのメソッドはクラステンプレートを使用します。
+`TextGenerationPipeline`クラスや`apply_chat_template`などのメソッドはクラステンプレートを使用します。
 トークナイザのデフォルトのチャットテンプレートを確認するには、`tokenizer.default_chat_template`属性をチェックしてください。

 これは、後方互換性のために純粋に行っていることで、既存のワークフローを壊さないようにしています。
@@ -233,7 +233,7 @@ I'm doing great!<|im_end|>
 ```

 「ユーザー」、「システム」、および「アシスタント」の役割は、チャットの標準です。
-特に、[`ConversationalPipeline`]との連携をスムーズに行う場合には、これらの役割を使用することをお勧めします。ただし、これらの役割に制約はありません。テンプレートは非常に柔軟で、任意の文字列を役割として使用できます。
+特に、`TextGenerationPipeline`との連携をスムーズに行う場合には、これらの役割を使用することをお勧めします。ただし、これらの役割に制約はありません。テンプレートは非常に柔軟で、任意の文字列を役割として使用できます。

 ## I want to use chat templates! How should I get started?

@@ -242,7 +242,7 @@ I'm doing great!<|im_end|>
 この属性を適切に設定できるように[プルリクエスト](https://huggingface.co/docs/hub/repositories-pull-requests-discussions)を開いてください。

 一度属性が設定されれば、それで完了です！ `tokenizer.apply_chat_template`は、そのモデルに対して正しく動作するようになります。これは、
-`ConversationalPipeline`などの場所でも自動的にサポートされます。
+`TextGenerationPipeline` などの場所でも自動的にサポートされます。

 モデルがこの属性を持つことを確認することで、オープンソースモデルの全コミュニティがそのフルパワーを使用できるようになります。
 フォーマットの不一致はこの分野に悩み続け、パフォーマンスに黙って影響を与えてきました。それを終わらせる時が来ました！

--- a/docs/source/ja/main_classes/pipelines.md
+++ b/docs/source/ja/main_classes/pipelines.md
@@ -388,14 +388,6 @@ my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)

 自然言語処理タスクに使用できるパイプラインには次のものがあります。

-### ConversationalPipeline
-
-[[autodoc]] Conversation
-
-[[autodoc]] ConversationalPipeline
-    - __call__
-    - all
-
 ### FillMaskPipeline

 [[autodoc]] FillMaskPipeline

--- a/docs/source/zh/chat_templating.md
+++ b/docs/source/zh/chat_templating.md
@@ -117,12 +117,12 @@ Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopte

 ## 有自动化的聊天`pipeline`吗？

-有的，[`ConversationalPipeline`]。这个`pipeline`的设计是为了方便使用聊天模型。让我们再试一次 Zephyr 的例子，但这次使用`pipeline`：
+有的，[`TextGenerationPipeline`]。这个`pipeline`的设计是为了方便使用聊天模型。让我们再试一次 Zephyr 的例子，但这次使用`pipeline`：

 ```python
 from transformers import pipeline

-pipe = pipeline("conversational", "HuggingFaceH4/zephyr-7b-beta")
+pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
 messages = [
    {
        "role": "system",
@@ -130,17 +130,14 @@ messages = [
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
 ]
-print(pipe(messages))
+print(pipe(messages, max_new_tokens=256)['generated_text'][-1])
 ```

 ```text
-Conversation id: 76d886a0-74bd-454e-9804-0467041a63dc
-system: You are a friendly chatbot who always responds in the style of a pirate
-user: How many helicopters can a human eat in one sitting?
-assistant: Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
+{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
 ```

-[`ConversationalPipeline`]将负责处理所有的`tokenized`并调用`apply_chat_template`，一旦模型有了聊天模板，您只需要初始化pipeline并传递消息列表！
+[`TextGenerationPipeline`]将负责处理所有的`tokenized`并调用`apply_chat_template`，一旦模型有了聊天模板，您只需要初始化pipeline并传递消息列表！

 ## 什么是"generation prompts"?

@@ -317,12 +314,12 @@ tokenizer.chat_template = template  # Set the new template
 tokenizer.push_to_hub("model_name")  # Upload your new template to the Hub!
 ```

-由于[`~PreTrainedTokenizer.apply_chat_template`]方法是由[`ConversationalPipeline`]类调用，
-因此一旦你设置了聊天模板，您的模型将自动与[`ConversationalPipeline`]兼容。
+由于[`~PreTrainedTokenizer.apply_chat_template`]方法是由[`TextGenerationPipeline`]类调用，
+因此一旦你设置了聊天模板，您的模型将自动与[`TextGenerationPipeline`]兼容。
 ### “默认”模板是什么？

 在引入聊天模板（chat_template）之前，聊天prompt是在模型中通过硬编码处理的。为了向前兼容，我们保留了这种硬编码处理聊天prompt的方法。
-如果一个模型没有设置聊天模板，但其模型有默认模板，`ConversationalPipeline`类和`apply_chat_template`等方法将使用该模型的聊天模板。
+如果一个模型没有设置聊天模板，但其模型有默认模板，`TextGenerationPipeline`类和`apply_chat_template`等方法将使用该模型的聊天模板。
 您可以通过检查`tokenizer.default_chat_template`属性来查找`tokenizer`的默认模板。

 这是我们纯粹为了向前兼容性而做的事情，以避免破坏任何现有的工作流程。即使默认的聊天模板适用于您的模型，
@@ -367,7 +364,7 @@ How are you?<|im_end|>
 I'm doing great!<|im_end|>
 ```

-`user`，`system`和`assistant`是对话助手模型的标准角色，如果您的模型要与[`ConversationalPipeline`]兼容，我们建议你使用这些角色。
+`user`，`system`和`assistant`是对话助手模型的标准角色，如果您的模型要与[`TextGenerationPipeline`]兼容，我们建议你使用这些角色。
 但您可以不局限于这些角色，模板非常灵活，任何字符串都可以成为角色。

 ### 如何添加聊天模板？
@@ -378,7 +375,7 @@ I'm doing great!<|im_end|>
 请发起一个[pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions)，以便正确设置该属性！

 一旦属性设置完成，就完成了！`tokenizer.apply_chat_template`现在将在该模型中正常工作，
-这意味着它也会自动支持在诸如`ConversationalPipeline`的地方！
+这意味着它也会自动支持在诸如`TextGenerationPipeline`的地方！

 通过确保模型具有这一属性，我们可以确保整个社区都能充分利用开源模型的全部功能。
 格式不匹配已经困扰这个领域并悄悄地损害了性能太久了，是时候结束它们了！

--- a/docs/source/zh/main_classes/pipelines.md
+++ b/docs/source/zh/main_classes/pipelines.md
@@ -362,14 +362,6 @@ my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)

 可用于自然语言处理任务的pipeline包括以下几种。

-### ConversationalPipeline
-
-[[autodoc]] Conversation
-
-[[autodoc]] ConversationalPipeline
-    - __call__
-    - all
-
 ### FillMaskPipeline

 [[autodoc]] FillMaskPipeline

--- a/src/transformers/__init__.py
+++ b/src/transformers/__init__.py
@@ -799,8 +799,6 @@ _import_structure = {
    "pipelines": [
        "AudioClassificationPipeline",
        "AutomaticSpeechRecognitionPipeline",
-        "Conversation",
-        "ConversationalPipeline",
        "CsvPipelineDataFormat",
        "DepthEstimationPipeline",
        "DocumentQuestionAnsweringPipeline",
@@ -5428,8 +5426,6 @@ if TYPE_CHECKING:
    from .pipelines import (
        AudioClassificationPipeline,
        AutomaticSpeechRecognitionPipeline,
-        Conversation,
-        ConversationalPipeline,
        CsvPipelineDataFormat,
        DepthEstimationPipeline,
        DocumentQuestionAnsweringPipeline,

--- a/src/transformers/models/cohere/tokenization_cohere_fast.py
+++ b/src/transformers/models/cohere/tokenization_cohere_fast.py
@@ -20,7 +20,6 @@ from typing import Dict, List, Literal, Union

 from tokenizers import processors

-from ...pipelines.conversational import Conversation
 from ...tokenization_utils_base import BatchEncoding
 from ...tokenization_utils_fast import PreTrainedTokenizerFast
 from ...utils import logging
@@ -413,7 +412,7 @@ class CohereTokenizerFast(PreTrainedTokenizerFast):

    def apply_tool_use_template(
        self,
-        conversation: Union[List[Dict[str, str]], "Conversation"],
+        conversation: Union[List[Dict[str, str]]],
        tools: List[Dict],
        **kwargs,
    ) -> Union[str, List[int]]:
@@ -424,13 +423,13 @@ class CohereTokenizerFast(PreTrainedTokenizerFast):

        Conceptually, this works in the same way as `apply_chat_format`, but takes an additional `tools` parameter.

-        Converts a Conversation object or a list of dictionaries with `"role"` and `"content"` keys and a list of available
+        Converts a chat in the form of a list of dictionaries with `"role"` and `"content"` keys and a list of available
        tools for the model to use into a prompt string, or a list of token ids.
        This method will use the tokenizer's `default_tool_use_template` template specified at the class level.
        You can override the default template using the `tool_use_template` kwarg but the quality of your results may decrease.

        Args:
-            conversation (Union[List[Dict[str, str]], "Conversation"]): A Conversation object or list of dicts
+            conversation (Union[List[Dict[str, str]]]): A list of dicts
                with "role" and "content" keys, representing the chat history so far.
            tools (List[Dict]): a list of tools to render into the prompt for the model to choose from.
                See an example at the bottom of the docstring.
@@ -568,7 +567,7 @@ class CohereTokenizerFast(PreTrainedTokenizerFast):

    def apply_grounded_generation_template(
        self,
-        conversation: Union[List[Dict[str, str]], "Conversation"],
+        conversation: Union[List[Dict[str, str]]],
        documents: List[Dict],
        citation_mode: Literal["fast", "accurate"] = "accurate",
        **kwargs,
@@ -580,13 +579,13 @@ class CohereTokenizerFast(PreTrainedTokenizerFast):
        Conceptually, this works in the same way as `apply_chat_format`, but takes additional `documents`
        and parameter `citation_mode` parameters.

-        Converts a Conversation object or a list of dictionaries with `"role"` and `"content"` keys and a list of
+        Converts a list of dictionaries with `"role"` and `"content"` keys and a list of
        documents for the model to ground its response on into a prompt string, or a list of token ids.
        This method will use the tokenizer's `grounded_generation_template` template specified at the class level.
        You can override the default template using the `grounded_generation_template` kwarg but the quality of your results may decrease.

        Args:
-            conversation (Union[List[Dict[str, str]], "Conversation"]): A Conversation object or list of dicts
+            conversation (Union[List[Dict[str, str]]]): A list of dicts
                with "role" and "content" keys, representing the chat history so far.
            documents (List[Dict[str, str]): A list of dicts, representing documents or tool outputs to ground your
                generation on. A document is a semistructured dict, wiht a string to string mapping. Common fields are

--- a/src/transformers/models/idefics2/processing_idefics2.py
+++ b/src/transformers/models/idefics2/processing_idefics2.py
@@ -26,7 +26,6 @@ from ...utils import TensorType, logging


 if TYPE_CHECKING:
-    from ...pipelines.conversational import Conversation
    from ...tokenization_utils_base import PreTokenizedInput


@@ -255,7 +254,7 @@ class Idefics2Processor(ProcessorMixin):

    def apply_chat_template(
        self,
-        conversation: Union[List[Dict[str, str]], "Conversation"],
+        conversation: Union[List[Dict[str, str]]],
        chat_template: Optional[str] = None,
        tokenize: bool = False,
        **kwargs,
@@ -269,7 +268,7 @@ class Idefics2Processor(ProcessorMixin):
        tokens to the sequence length or adding the surrounding tokens e.g. <fake_image_token>.

        Args:
-            conversation (`Union[List[Dict, str, str], "Conversation"]`):
+            conversation (`Union[List[Dict, str, str]]`):
                The conversation to format.
            chat_template (`Optional[str]`, *optional*):
                The Jinja template to use for formatting the conversation. If not provided, the default chat template

--- a/src/transformers/pipelines/__init__.py
+++ b/src/transformers/pipelines/__init__.py
@@ -58,7 +58,6 @@ from .base import (
    get_default_model_and_revision,
    infer_framework_load_model,
 )
-from .conversational import Conversation, ConversationalPipeline
 from .depth_estimation import DepthEstimationPipeline
 from .document_question_answering import DocumentQuestionAnsweringPipeline
 from .feature_extraction import FeatureExtractionPipeline
@@ -340,15 +339,6 @@ SUPPORTED_TASKS = {
        },
        "type": "multimodal",
    },
-    "conversational": {
-        "impl": ConversationalPipeline,
-        "tf": (TFAutoModelForSeq2SeqLM, TFAutoModelForCausalLM) if is_tf_available() else (),
-        "pt": (AutoModelForSeq2SeqLM, AutoModelForCausalLM) if is_torch_available() else (),
-        "default": {
-            "model": {"pt": ("microsoft/DialoGPT-medium", "8bada3b"), "tf": ("microsoft/DialoGPT-medium", "8bada3b")}
-        },
-        "type": "text",
-    },
    "image-classification": {
        "impl": ImageClassificationPipeline,
        "tf": (TFAutoModelForImageClassification,) if is_tf_available() else (),
@@ -593,7 +583,6 @@ def pipeline(

            - `"audio-classification"`: will return a [`AudioClassificationPipeline`].
            - `"automatic-speech-recognition"`: will return a [`AutomaticSpeechRecognitionPipeline`].
-            - `"conversational"`: will return a [`ConversationalPipeline`].
            - `"depth-estimation"`: will return a [`DepthEstimationPipeline`].
            - `"document-question-answering"`: will return a [`DocumentQuestionAnsweringPipeline`].
            - `"feature-extraction"`: will return a [`FeatureExtractionPipeline`].

--- a/src/transformers/pipelines/conversational.py
+++ b/src/transformers/pipelines/conversational.py
-import uuid
-import warnings
-from typing import Any, Dict, List, Union
-
-from ..utils import add_end_docstrings, is_tf_available, is_torch_available, logging
-from .base import Pipeline, build_pipeline_init_args
-
-
-if is_tf_available():
-    import tensorflow as tf
-
-if is_torch_available():
-    import torch
-
-
-logger = logging.get_logger(__name__)
-
-
-class Conversation:
-    """
-    Utility class containing a conversation and its history. This class is meant to be used as an input to the
-    [`ConversationalPipeline`]. The conversation contains several utility functions to manage the addition of new user
-    inputs and generated model responses.
-
-    Arguments:
-        messages (Union[str, List[Dict[str, str]]], *optional*):
-            The initial messages to start the conversation, either a string, or a list of dicts containing "role" and
-            "content" keys. If a string is passed, it is interpreted as a single message with the "user" role.
-        conversation_id (`uuid.UUID`, *optional*):
-            Unique identifier for the conversation. If not provided, a random UUID4 id will be assigned to the
-            conversation.
-
-    Usage:
-
-    ```python
-    conversation = Conversation("Going to the movies tonight - any suggestions?")
-    conversation.add_message({"role": "assistant", "content": "The Big lebowski."})
-    conversation.add_message({"role": "user", "content": "Is it good?"})
-    ```"""
-
-    def __init__(
-        self, messages: Union[str, List[Dict[str, str]]] = None, conversation_id: uuid.UUID = None, **deprecated_kwargs
-    ):
-        if not conversation_id:
-            conversation_id = uuid.uuid4()
-
-        if messages is None:
-            text = deprecated_kwargs.pop("text", None)
-            if text is not None:
-                messages = [{"role": "user", "content": text}]
-            else:
-                messages = []
-        elif isinstance(messages, str):
-            messages = [{"role": "user", "content": messages}]
-
-        # This block deals with the legacy args - new code should just totally
-        # avoid past_user_inputs and generated_responses
-        self._num_processed_user_inputs = 0
-        generated_responses = deprecated_kwargs.pop("generated_responses", None)
-        past_user_inputs = deprecated_kwargs.pop("past_user_inputs", None)
-        if generated_responses is not None and past_user_inputs is None:
-            raise ValueError("generated_responses cannot be passed without past_user_inputs!")
-        if past_user_inputs is not None:
-            legacy_messages = []
-            if generated_responses is None:
-                generated_responses = []
-            # We structure it this way instead of using zip() because the lengths may differ by 1
-            for i in range(max([len(past_user_inputs), len(generated_responses)])):
-                if i < len(past_user_inputs):
-                    legacy_messages.append({"role": "user", "content": past_user_inputs[i]})
-                if i < len(generated_responses):
-                    legacy_messages.append({"role": "assistant", "content": generated_responses[i]})
-            messages = legacy_messages + messages
-
-        self.uuid = conversation_id
-        self.messages = messages
-
-    def __eq__(self, other):
-        if not isinstance(other, Conversation):
-            return False
-        return self.uuid == other.uuid or self.messages == other.messages
-
-    def add_message(self, message: Dict[str, str]):
-        if not set(message.keys()) == {"role", "content"}:
-            raise ValueError("Message should contain only 'role' and 'content' keys!")
-        if message["role"] not in ("user", "assistant", "system"):
-            raise ValueError("Only 'user', 'assistant' and 'system' roles are supported for now!")
-        self.messages.append(message)
-
-    def add_user_input(self, text: str, overwrite: bool = False):
-        """
-        Add a user input to the conversation for the next round. This is a legacy method that assumes that inputs must
-        alternate user/assistant/user/assistant, and so will not add multiple user messages in succession. We recommend
-        just using `add_message` with role "user" instead.
-        """
-        if len(self) > 0 and self[-1]["role"] == "user":
-            if overwrite:
-                logger.warning(
-                    f'User input added while unprocessed input was existing: "{self[-1]["content"]}" was overwritten '
-                    f'with: "{text}".'
-                )
-                self[-1]["content"] = text
-            else:
-                logger.warning(
-                    f'User input added while unprocessed input was existing: "{self[-1]["content"]}" new input '
-                    f'ignored: "{text}". Set `overwrite` to True to overwrite unprocessed user input'
-                )
-        else:
-            self.messages.append({"role": "user", "content": text})
-
-    def append_response(self, response: str):
-        """
-        This is a legacy method. We recommend just using `add_message` with an appropriate role instead.
-        """
-        self.messages.append({"role": "assistant", "content": response})
-
-    def mark_processed(self):
-        """
-        This is a legacy method, as the Conversation no longer distinguishes between processed and unprocessed user
-        input. We set a counter here to keep behaviour mostly backward-compatible, but in general you should just read
-        the messages directly when writing new code.
-        """
-        self._num_processed_user_inputs = len(self._user_messages)
-
-    def __iter__(self):
-        for message in self.messages:
-            yield message
-
-    def __getitem__(self, item):
-        return self.messages[item]
-
-    def __setitem__(self, key, value):
-        self.messages[key] = value
-
-    def __len__(self):
-        return len(self.messages)
-
-    def __repr__(self):
-        """
-        Generates a string representation of the conversation.
-
-        Returns:
-            `str`:
-
-        Example:
-            Conversation id: 7d15686b-dc94-49f2-9c4b-c9eac6a1f114 user: Going to the movies tonight - any suggestions?
-            bot: The Big Lebowski
-        """
-        output = f"Conversation id: {self.uuid}\n"
-        for message in self.messages:
-            output += f"{message['role']}: {message['content']}\n"
-        return output
-
-    def iter_texts(self):
-        # This is a legacy method for backwards compatibility. It is recommended to just directly access
-        # conversation.messages instead.
-        for message in self.messages:
-            yield message["role"] == "user", message["content"]
-
-    @property
-    def _user_messages(self):
-        # This is a legacy property for backwards compatibility. It is recommended to just directly access
-        # conversation.messages instead.
-        return [message["content"] for message in self.messages if message["role"] == "user"]
-
-    @property
-    def past_user_inputs(self):
-        # This is a legacy property for backwards compatibility. It is recommended to just directly access
-        # conversation.messages instead. The modern class does not care about which messages are "processed"
-        # or not.
-        if not self._user_messages:
-            return []
-        # In the past, the most recent user message had to be mark_processed() before being included
-        # in past_user_messages. The class essentially had a single-message buffer, representing messages that
-        # had not yet been replied to. This is no longer the case, but we mimic the behaviour in this property
-        # for backward compatibility.
-        if self.messages[-1]["role"] != "user" or self._num_processed_user_inputs == len(self._user_messages):
-            return self._user_messages
-
-        return self._user_messages[:-1]
-
-    @property
-    def generated_responses(self):
-        # This is a legacy property for backwards compatibility. It is recommended to just directly access
-        # conversation.messages instead.
-        return [message["content"] for message in self.messages if message["role"] == "assistant"]
-
-    @property
-    def new_user_input(self):
-        # This is a legacy property for backwards compatibility. It is recommended to just directly access
-        # conversation.messages instead.
-        return self._user_messages[-1]
-
-
-@add_end_docstrings(
-    build_pipeline_init_args(has_tokenizer=True),
-    r"""
-        min_length_for_response (`int`, *optional*, defaults to 32):
-            The minimum length (in number of tokens) for a response.""",
-)
-class ConversationalPipeline(Pipeline):
-    """
-    Multi-turn conversational pipeline.
-
-    Example:
-
-    ```python
-    >>> from transformers import pipeline, Conversation
-    # Any model with a chat template can be used in a ConversationalPipeline.
-
-    >>> chatbot = pipeline(model="facebook/blenderbot-400M-distill")
-    >>> # Conversation objects initialized with a string will treat it as a user message
-    >>> conversation = Conversation("I'm looking for a movie - what's your favourite one?")
-    >>> conversation = chatbot(conversation)
-    >>> conversation.messages[-1]["content"]
-    "I don't really have a favorite movie, but I do like action movies. What about you?"
-
-    >>> conversation.add_message({"role": "user", "content": "That's interesting, why do you like action movies?"})
-    >>> conversation = chatbot(conversation)
-    >>> conversation.messages[-1]["content"]
-    " I think it's just because they're so fast-paced and action-fantastic."
-    ```
-
-    Learn more about the basics of using a pipeline in the [pipeline tutorial](../pipeline_tutorial)
-
-    This conversational pipeline can currently be loaded from [`pipeline`] using the following task identifier:
-    `"conversational"`.
-
-    This pipeline can be used with any model that has a [chat
-    template](https://huggingface.co/docs/transformers/chat_templating) set.
-    """
-
-    def __init__(self, *args, **kwargs):
-        warnings.warn(
-            "`ConversationalPipeline` is now deprecated, and the functionality has been moved to the standard `text-generation` pipeline, which now accepts lists of message dicts as well as strings. This class will be removed in v4.42.",
-            DeprecationWarning,
-        )
-        super().__init__(*args, **kwargs)
-        if self.tokenizer.pad_token_id is None:
-            self.tokenizer.pad_token = self.tokenizer.eos_token
-
-    def _sanitize_parameters(self, min_length_for_response=None, clean_up_tokenization_spaces=None, **generate_kwargs):
-        preprocess_params = {}
-        forward_params = {}
-        postprocess_params = {}
-
-        if min_length_for_response is not None:
-            preprocess_params["min_length_for_response"] = min_length_for_response
-
-        if "max_length" in generate_kwargs:
-            forward_params["max_length"] = generate_kwargs["max_length"]
-            # self.max_length = generate_kwargs.get("max_length", self.model.config.max_length)
-        if clean_up_tokenization_spaces is not None:
-            postprocess_params["clean_up_tokenization_spaces"] = clean_up_tokenization_spaces
-
-        if generate_kwargs:
-            forward_params.update(generate_kwargs)
-        return preprocess_params, forward_params, postprocess_params
-
-    def __call__(self, conversations: Union[List[Dict], Conversation, List[Conversation]], num_workers=0, **kwargs):
-        r"""
-        Generate responses for the conversation(s) given as inputs.
-
-        Args:
-            conversations (a [`Conversation`] or a list of [`Conversation`]):
-                Conversation to generate responses for. Inputs can also be passed as a list of dictionaries with `role`
-                and `content` keys - in this case, they will be converted to `Conversation` objects automatically.
-                Multiple conversations in either format may be passed as a list.
-            clean_up_tokenization_spaces (`bool`, *optional*, defaults to `True`):
-                Whether or not to clean up the potential extra spaces in the text output.
-            generate_kwargs:
-                Additional keyword arguments to pass along to the generate method of the model (see the generate method
-                corresponding to your framework [here](./main_classes/text_generation)).
-
-        Returns:
-            [`Conversation`] or a list of [`Conversation`]: Conversation(s) with updated generated responses for those
-            containing a new user input.
-        """
-        # XXX: num_workers==0 is required to be backward compatible
-        # Otherwise the threads will require a Conversation copy.
-        # This will definitely hinder performance on GPU, but has to be opted
-        # in because of this BC change.
-        if isinstance(conversations, list) and isinstance(conversations[0], dict):
-            conversations = Conversation(conversations)
-        elif isinstance(conversations, list) and isinstance(conversations[0], list):
-            conversations = [Conversation(conv) for conv in conversations]
-        outputs = super().__call__(conversations, num_workers=num_workers, **kwargs)
-        if isinstance(outputs, list) and len(outputs) == 1:
-            return outputs[0]
-        return outputs
-
-    def preprocess(self, conversation: Conversation, min_length_for_response=32) -> Dict[str, Any]:
-        input_ids = self.tokenizer.apply_chat_template(conversation, add_generation_prompt=True)
-
-        if self.framework == "pt":
-            input_ids = torch.LongTensor([input_ids])
-        elif self.framework == "tf":
-            input_ids = tf.constant([input_ids])
-        return {"input_ids": input_ids, "conversation": conversation}
-
-    def _forward(self, model_inputs, **generate_kwargs):
-        n = model_inputs["input_ids"].shape[1]
-        conversation = model_inputs.pop("conversation")
-        if "max_length" not in generate_kwargs and "max_new_tokens" not in generate_kwargs:
-            generate_kwargs["max_new_tokens"] = 256
-        output_ids = self.model.generate(**model_inputs, **generate_kwargs)
-        if self.model.config.is_encoder_decoder:
-            start_position = 1
-        else:
-            start_position = n
-        return {"output_ids": output_ids[:, start_position:], "conversation": conversation}
-
-    def postprocess(self, model_outputs, clean_up_tokenization_spaces=True):
-        output_ids = model_outputs["output_ids"]
-        answer = self.tokenizer.decode(
-            output_ids[0],
-            skip_special_tokens=True,
-            clean_up_tokenization_spaces=clean_up_tokenization_spaces,
-        )
-        conversation = model_outputs["conversation"]
-        conversation.add_message({"role": "assistant", "content": answer})
-        return conversation
--- a/src/transformers/tokenization_utils_base.py
+++ b/src/transformers/tokenization_utils_base.py
@@ -72,8 +72,6 @@ if TYPE_CHECKING:
        import tensorflow as tf
    if is_flax_available():
        import jax.numpy as jnp  # noqa: F401
-    from .pipelines.conversational import Conversation
-

 if is_tokenizers_available():
    from tokenizers import AddedToken
@@ -1684,7 +1682,7 @@ class PreTrainedTokenizerBase(SpecialTokensMixin, PushToHubMixin):

    def apply_chat_template(
        self,
-        conversation: Union[List[Dict[str, str]], List[List[Dict[str, str]]], "Conversation"],
+        conversation: Union[List[Dict[str, str]], List[List[Dict[str, str]]]],
        chat_template: Optional[str] = None,
        add_generation_prompt: bool = False,
        tokenize: bool = True,
@@ -1703,7 +1701,7 @@ class PreTrainedTokenizerBase(SpecialTokensMixin, PushToHubMixin):
        to the default_chat_template specified at the class level.

        Args:
-            conversation (Union[List[Dict[str, str]], List[List[Dict[str, str]]], "Conversation"]): A list of dicts
+            conversation (Union[List[Dict[str, str]], List[List[Dict[str, str]]]]): A list of dicts
                with "role" and "content" keys, representing the chat history so far.
            chat_template (str, *optional*): A Jinja template to use for this conversion. If
                this is not passed, the model's default chat template will be used instead.

--- a/tests/models/bart/test_modeling_bart.py
+++ b/tests/models/bart/test_modeling_bart.py
@@ -430,7 +430,6 @@ class BartModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin
    all_generative_model_classes = (BartForConditionalGeneration,) if is_torch_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": BartForConditionalGeneration,
            "feature-extraction": BartModel,
            "fill-mask": BartForConditionalGeneration,
            "question-answering": BartForQuestionAnswering,
@@ -513,10 +512,6 @@ class BartModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin
        model.generate(input_ids, attention_mask=attention_mask)
        model.generate(num_beams=4, do_sample=True, early_stopping=False, num_return_sequences=3)

-    @unittest.skip("Does not support conversations.")
-    def test_pipeline_conversational(self):
-        pass
-

 def assert_tensors_close(a, b, atol=1e-12, prefix=""):
    """If tensors have different shapes, different values or a and b are not both tensors, raise a nice Assertion error."""

--- a/tests/models/bart/test_modeling_tf_bart.py
+++ b/tests/models/bart/test_modeling_tf_bart.py
@@ -198,7 +198,6 @@ class TFBartModelTest(TFModelTesterMixin, TFCoreModelTesterMixin, PipelineTester
    all_generative_model_classes = (TFBartForConditionalGeneration,) if is_tf_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": TFBartForConditionalGeneration,
            "feature-extraction": TFBartModel,
            "summarization": TFBartForConditionalGeneration,
            "text-classification": TFBartForSequenceClassification,
@@ -343,10 +342,6 @@ class TFBartModelTest(TFModelTesterMixin, TFCoreModelTesterMixin, PipelineTester
                # check that the output for the restored model is the same
                self.assert_outputs_same(restored_model_outputs, outputs)

-    @unittest.skip("Does not support conversations.")
-    def test_pipeline_conversational(self):
-        pass
-

 def _long_tensor(tok_lst):
    return tf.constant(tok_lst, dtype=tf.int32)

--- a/tests/models/bigbird_pegasus/test_modeling_bigbird_pegasus.py
+++ b/tests/models/bigbird_pegasus/test_modeling_bigbird_pegasus.py
@@ -253,7 +253,6 @@ class BigBirdPegasusModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineT
    all_generative_model_classes = (BigBirdPegasusForConditionalGeneration,) if is_torch_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": BigBirdPegasusForConditionalGeneration,
            "feature-extraction": BigBirdPegasusModel,
            "question-answering": BigBirdPegasusForQuestionAnswering,
            "summarization": BigBirdPegasusForConditionalGeneration,

--- a/tests/models/blenderbot/test_modeling_blenderbot.py
+++ b/tests/models/blenderbot/test_modeling_blenderbot.py
@@ -237,7 +237,6 @@ class BlenderbotModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTeste
    all_generative_model_classes = (BlenderbotForConditionalGeneration,) if is_torch_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": BlenderbotForConditionalGeneration,
            "feature-extraction": BlenderbotModel,
            "summarization": BlenderbotForConditionalGeneration,
            "text-generation": BlenderbotForCausalLM,

--- a/tests/models/blenderbot/test_modeling_tf_blenderbot.py
+++ b/tests/models/blenderbot/test_modeling_tf_blenderbot.py
@@ -183,7 +183,6 @@ class TFBlenderbotModelTest(TFModelTesterMixin, PipelineTesterMixin, unittest.Te
    all_generative_model_classes = (TFBlenderbotForConditionalGeneration,) if is_tf_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": TFBlenderbotForConditionalGeneration,
            "feature-extraction": TFBlenderbotModel,
            "summarization": TFBlenderbotForConditionalGeneration,
            "text2text-generation": TFBlenderbotForConditionalGeneration,

--- a/tests/models/blenderbot_small/test_modeling_blenderbot_small.py
+++ b/tests/models/blenderbot_small/test_modeling_blenderbot_small.py
@@ -228,7 +228,6 @@ class BlenderbotSmallModelTest(ModelTesterMixin, GenerationTesterMixin, Pipeline
    all_generative_model_classes = (BlenderbotSmallForConditionalGeneration,) if is_torch_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": BlenderbotSmallForConditionalGeneration,
            "feature-extraction": BlenderbotSmallModel,
            "summarization": BlenderbotSmallForConditionalGeneration,
            "text-generation": BlenderbotSmallForCausalLM,
@@ -247,7 +246,7 @@ class BlenderbotSmallModelTest(ModelTesterMixin, GenerationTesterMixin, Pipeline
    def is_pipeline_test_to_skip(
        self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
    ):
-        return pipeline_test_casse_name in ("TextGenerationPipelineTests", "ConversationalPipelineTests")
+        return pipeline_test_casse_name == "TextGenerationPipelineTests"

    def setUp(self):
        self.model_tester = BlenderbotSmallModelTester(self)

--- a/tests/models/blenderbot_small/test_modeling_flax_blenderbot_small.py
+++ b/tests/models/blenderbot_small/test_modeling_flax_blenderbot_small.py
@@ -323,7 +323,7 @@ class FlaxBlenderbotSmallModelTest(FlaxModelTesterMixin, unittest.TestCase, Flax
    def is_pipeline_test_to_skip(
        self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
    ):
-        return pipeline_test_casse_name in ("TextGenerationPipelineTests", "ConversationalPipelineTests")
+        return pipeline_test_casse_name == "TextGenerationPipelineTests"

    def setUp(self):
        self.model_tester = FlaxBlenderbotSmallModelTester(self)

--- a/tests/models/blenderbot_small/test_modeling_tf_blenderbot_small.py
+++ b/tests/models/blenderbot_small/test_modeling_tf_blenderbot_small.py
@@ -185,7 +185,6 @@ class TFBlenderbotSmallModelTest(TFModelTesterMixin, PipelineTesterMixin, unitte
    all_generative_model_classes = (TFBlenderbotSmallForConditionalGeneration,) if is_tf_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": TFBlenderbotSmallForConditionalGeneration,
            "feature-extraction": TFBlenderbotSmallModel,
            "summarization": TFBlenderbotSmallForConditionalGeneration,
            "text2text-generation": TFBlenderbotSmallForConditionalGeneration,
@@ -201,7 +200,7 @@ class TFBlenderbotSmallModelTest(TFModelTesterMixin, PipelineTesterMixin, unitte
    def is_pipeline_test_to_skip(
        self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
    ):
-        return pipeline_test_casse_name in ("TextGenerationPipelineTests", "ConversationalPipelineTests")
+        return pipeline_test_casse_name == "TextGenerationPipelineTests"

    def setUp(self):
        self.model_tester = TFBlenderbotSmallModelTester(self)

--- a/tests/models/fsmt/test_modeling_fsmt.py
+++ b/tests/models/fsmt/test_modeling_fsmt.py
@@ -166,7 +166,6 @@ class FSMTModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin
    all_generative_model_classes = (FSMTForConditionalGeneration,) if is_torch_available() else ()
    pipeline_model_mapping = (
        {
-            "conversational": FSMTForConditionalGeneration,
            "feature-extraction": FSMTModel,
            "summarization": FSMTForConditionalGeneration,
            "text2text-generation": FSMTForConditionalGeneration,