Commit eb5b5ce6 authored by Sylvain Gugger, committed by GitHub

Render custom tool docs a bit better (#23269)

* Try on a couple of blocks to see

* Build the doc please

* Build the doc please

* Build the doc please

* add more

* Finish with all

* Style
parent 42017d82
...@@ -54,7 +54,7 @@ The prompt is structured broadly into four parts. ...@@ -54,7 +54,7 @@ The prompt is structured broadly into four parts.
To better understand each part, let's look at a shortened version of how the `run` prompt can look like: To better understand each part, let's look at a shortened version of how the `run` prompt can look like:
```` ````text
I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task. I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.
[...] [...]
You can print intermediate results if it makes sense to do so. You can print intermediate results if it makes sense to do so.
...@@ -101,7 +101,7 @@ The second part (the bullet points below *"Tools"*) is dynamically added upon ca ...@@ -101,7 +101,7 @@ The second part (the bullet points below *"Tools"*) is dynamically added upon ca
exactly as many bullet points as there are tools in `agent.toolbox` and each bullet point consists of the name exactly as many bullet points as there are tools in `agent.toolbox` and each bullet point consists of the name
and description of the tool: and description of the tool:
``` ```text
- <tool.name>: <tool.description> - <tool.name>: <tool.description>
``` ```
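As a rough illustration of the bullet-point template in this hunk, the sketch below renders one `- <tool.name>: <tool.description>` line per toolbox entry. `FakeTool` and `render_tool_bullets` are hypothetical names used only for this example, not part of the library; the only assumption about real tools is that they expose `name` and `description`, as the documentation states.

```python
from dataclasses import dataclass


@dataclass
class FakeTool:
    """Hypothetical stand-in for a tool; only `name` and `description` matter here."""

    name: str
    description: str


def render_tool_bullets(toolbox):
    """Return one `- <name>: <description>` line per tool in the toolbox."""
    return "\n".join(f"- {tool.name}: {tool.description}" for tool in toolbox.values())


# Illustrative toolbox; the descriptions are shortened placeholders.
toolbox = {
    "document_qa": FakeTool("document_qa", "Answers a question about a document."),
    "image_generator": FakeTool("image_generator", "Creates an image from a prompt."),
}

tools_section = render_tool_bullets(toolbox)
print(tools_section)
```

The number of rendered lines always equals the number of toolbox entries, which is the property the documentation text describes.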
...@@ -115,7 +115,7 @@ print(f"- {document_qa.name}: {document_qa.description}") ...@@ -115,7 +115,7 @@ print(f"- {document_qa.name}: {document_qa.description}")
``` ```
which gives: which gives:
``` ```text
- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question. - document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.
``` ```
...@@ -143,7 +143,7 @@ executable code in practice. ...@@ -143,7 +143,7 @@ executable code in practice.
Let's have a look at one example: Let's have a look at one example:
```` ````text
Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner." Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."
I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer. I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
...@@ -166,7 +166,7 @@ The prompt examples are curated by the Transformers team and rigorously evaluate ...@@ -166,7 +166,7 @@ The prompt examples are curated by the Transformers team and rigorously evaluate
to ensure that the agent's prompt is as good as possible to solve real use cases of the agent. to ensure that the agent's prompt is as good as possible to solve real use cases of the agent.
The final part of the prompt corresponds to: The final part of the prompt corresponds to:
``` ```text
Task: "Draw me a picture of rivers and lakes" Task: "Draw me a picture of rivers and lakes"
I will use the following I will use the following
...@@ -187,7 +187,7 @@ exactly in the same way it was previously done in the examples. ...@@ -187,7 +187,7 @@ exactly in the same way it was previously done in the examples.
Without going into too much detail, the chat template has the same prompt structure with the Without going into too much detail, the chat template has the same prompt structure with the
examples having a slightly different style, *e.g.*: examples having a slightly different style, *e.g.*:
```` ````text
[...] [...]
===== =====
...@@ -225,8 +225,8 @@ to past exchanges as is done *e.g.* above by the user's input of "I tried **this ...@@ -225,8 +225,8 @@ to past exchanges as is done *e.g.* above by the user's input of "I tried **this
previously generated code of the agent. previously generated code of the agent.
Upon running `.chat`, the user's input or *task* is cast into an unfinished example of the form: Upon running `.chat`, the user's input or *task* is cast into an unfinished example of the form:
``` ```text
Human: <user-input>\n\nAssistent: Human: <user-input>\n\nAssistant:
``` ```
which the agent completes. Contrary to the `run` command, the `chat` command then appends the completed example which the agent completes. Contrary to the `run` command, the `chat` command then appends the completed example
to the prompt, thus giving the agent more context for the next `chat` turn. to the prompt, thus giving the agent more context for the next `chat` turn.
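The difference between `run` and `chat` described here can be sketched as plain string handling. This is a minimal sketch, not the library's implementation: `start_chat_turn` and `finish_chat_turn` are hypothetical helpers, and the blank line used to separate completed turns is an assumption.

```python
def start_chat_turn(history, user_input):
    """Append an unfinished `Human: ... Assistant:` example to the running prompt."""
    return f"{history}Human: {user_input}\n\nAssistant:"


def finish_chat_turn(unfinished_prompt, completion):
    """Store the completed example so the next turn sees it as context.

    The trailing blank line between turns is an assumption made for this sketch.
    """
    return f"{unfinished_prompt} {completion.strip()}\n\n"


history = ""  # a real prompt would start with the template and the examples
first_prompt = start_chat_turn(history, "Draw a tree")
history = finish_chat_turn(first_prompt, "I will use `image_generator`.")
second_prompt = start_chat_turn(history, "Make it bigger")
print(second_prompt)
```

In `run` mode the equivalent of `history` would be reset for every call, so the second prompt would contain only the second task.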
...@@ -254,7 +254,7 @@ agent.run("Show me a tree", return_code=True) ...@@ -254,7 +254,7 @@ agent.run("Show me a tree", return_code=True)
gives: gives:
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tool: `image_segmenter` to create a segmentation mask for the image. I will use the following tool: `image_segmenter` to create a segmentation mask for the image.
...@@ -269,7 +269,8 @@ are present in the tool's name and description. Let's have a look. ...@@ -269,7 +269,8 @@ are present in the tool's name and description. Let's have a look.
```py ```py
agent.toolbox["image_generator"].description agent.toolbox["image_generator"].description
``` ```
```
```text
'This is a tool that creates an image according to a prompt, which is a text description. It takes an input named `prompt` which contains the image description and outputs an image. 'This is a tool that creates an image according to a prompt, which is a text description. It takes an input named `prompt` which contains the image description and outputs an image.
``` ```
...@@ -280,7 +281,7 @@ agent.run("Create an image of a tree", return_code=True) ...@@ -280,7 +281,7 @@ agent.run("Create an image of a tree", return_code=True)
``` ```
gives: gives:
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tool `image_generator` to generate an image of a tree. I will use the following tool `image_generator` to generate an image of a tree.
...@@ -307,7 +308,7 @@ used a lot for image generation tasks, *e.g.* ...@@ -307,7 +308,7 @@ used a lot for image generation tasks, *e.g.*
agent.run("Make an image of a house and a car", return_code=True) agent.run("Make an image of a house and a car", return_code=True)
``` ```
returns returns
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tools `image_generator` to generate an image of a house and `image_transformer` to transform the image of a car into the image of a house. I will use the following tools `image_generator` to generate an image of a house and `image_transformer` to transform the image of a car into the image of a house.
...@@ -322,9 +323,11 @@ to understand the difference between `image_generator` and `image_transformer` a ...@@ -322,9 +323,11 @@ to understand the difference between `image_generator` and `image_transformer` a
We can help the agent here by changing the tool name and description of `image_transformer`. Let's instead call it `modifier` We can help the agent here by changing the tool name and description of `image_transformer`. Let's instead call it `modifier`
to disassociate it a bit from "image" and "prompt": to disassociate it a bit from "image" and "prompt":
``` ```py
agent.toolbox["modifier"] = agent.toolbox.pop("image_transformer") agent.toolbox["modifier"] = agent.toolbox.pop("image_transformer")
agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace("transforms an image according to a prompt", "modifies an image") agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace(
"transforms an image according to a prompt", "modifies an image"
)
``` ```
Now "modify" is a strong cue to use the new image processor which should help with the above prompt. Let's run it again. Now "modify" is a strong cue to use the new image processor which should help with the above prompt. Let's run it again.
...@@ -334,7 +337,7 @@ agent.run("Make an image of a house and a car", return_code=True) ...@@ -334,7 +337,7 @@ agent.run("Make an image of a house and a car", return_code=True)
``` ```
Now we're getting: Now we're getting:
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tools: `image_generator` to generate an image of a house, then `image_generator` to generate an image of a car. I will use the following tools: `image_generator` to generate an image of a house, then `image_generator` to generate an image of a car.
...@@ -350,7 +353,7 @@ which is definitely closer to what we had in mind! However, we want to have both ...@@ -350,7 +353,7 @@ which is definitely closer to what we had in mind! However, we want to have both
agent.run("Create image: 'A house and car'", return_code=True) agent.run("Create image: 'A house and car'", return_code=True)
``` ```
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tool: `image_generator` to generate an image. I will use the following tool: `image_generator` to generate an image.
...@@ -389,7 +392,7 @@ of the tools, it has available to it as well as correctly insert the user's prom ...@@ -389,7 +392,7 @@ of the tools, it has available to it as well as correctly insert the user's prom
</Tip> </Tip>
Similarly, one can overwrite the `chat` prompt template. Note that the `chat` mode always uses the following format for the exchanges: Similarly, one can overwrite the `chat` prompt template. Note that the `chat` mode always uses the following format for the exchanges:
``` ```text
Human: <<task>> Human: <<task>>
Assistant: Assistant:
...@@ -441,7 +444,7 @@ print(f"Name: '{controlnet_transformer.name}'") ...@@ -441,7 +444,7 @@ print(f"Name: '{controlnet_transformer.name}'")
``` ```
gives gives
``` ```text
Description: 'This is a tool that transforms an image with ControlNet according to a prompt. Description: 'This is a tool that transforms an image with ControlNet according to a prompt.
It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.' It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.'
Name: 'image_transformer' Name: 'image_transformer'
...@@ -457,7 +460,7 @@ agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder", ...@@ -457,7 +460,7 @@ agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder",
This command should give you the following info: This command should give you the following info:
``` ```text
image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0 image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0
8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools` 8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools`
``` ```
...@@ -480,7 +483,7 @@ You can always have a look at the toolbox that is currently available to the age ...@@ -480,7 +483,7 @@ You can always have a look at the toolbox that is currently available to the age
print("\n".join([f"- {a}" for a in agent.toolbox.keys()])) print("\n".join([f"- {a}" for a in agent.toolbox.keys()]))
``` ```
``` ```text
- document_qa - document_qa
- image_captioner - image_captioner
- image_qa - image_qa
...@@ -518,7 +521,7 @@ Let's transform the image into a beautiful winter landscape: ...@@ -518,7 +521,7 @@ Let's transform the image into a beautiful winter landscape:
image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image) image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image)
``` ```
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tool: `image_transformer` to transform the image. I will use the following tool: `image_transformer` to transform the image.
...@@ -536,7 +539,7 @@ By default the image processing tool returns an image of size 512x512 pixels. Le ...@@ -536,7 +539,7 @@ By default the image processing tool returns an image of size 512x512 pixels. Le
image = agent.run("Upscale the image", image) image = agent.run("Upscale the image", image)
``` ```
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tool: `image_upscaler` to upscale the image. I will use the following tool: `image_upscaler` to upscale the image.
...@@ -657,7 +660,7 @@ agent.run( ...@@ -657,7 +660,7 @@ agent.run(
) )
``` ```
which outputs the following: which outputs the following:
``` ```text
==Code generated by the agent== ==Code generated by the agent==
model = model_download_counter(task="text-to-video") model = model_download_counter(task="text-to-video")
print(f"The model with the most downloads is {model}.") print(f"The model with the most downloads is {model}.")
...@@ -738,7 +741,7 @@ agent.run("Generate an image of the `prompt` after improving it.", prompt="A rab ...@@ -738,7 +741,7 @@ agent.run("Generate an image of the `prompt` after improving it.", prompt="A rab
``` ```
The model adequately leverages the tool: The model adequately leverages the tool:
``` ```text
==Explanation from the agent== ==Explanation from the agent==
I will use the following tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt. I will use the following tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt.
......