"vscode:/vscode.git/clone" did not exist on "d0d5aee1dd26de108a901921b9df19b889430645"
Unverified Commit f93509b1 authored by Sylvain Gugger, committed by GitHub

Refine documentation for Tools (#23266)

* refine documentation for Tools

* + one bugfix
parent 5f26a23d
@@ -102,7 +102,7 @@
- local: community
title: Community resources
- local: custom_tools
title: Custom Tools
title: Custom Tools and Prompts
- local: troubleshooting
title: Troubleshoot
title: Developer guides
......
@@ -124,7 +124,9 @@ what the tool does and the second states what input arguments and return values
A good tool name and tool description are very important for the agent to correctly use it. Note that the only
information the agent has about the tool is its name and description, so one should make sure that both
are precisely written and match the style of the existing tools in the toolbox.
are precisely written and match the style of the existing tools in the toolbox. In particular, make sure the description
mentions all the expected arguments by name in code-style, along with their expected type and a description of what
they are.
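As a hedged illustration (the tool below is hypothetical and not part of the curated toolbox), a description that follows these guidelines could look like:

```python
from transformers import Tool


class ImageBrightnessTool(Tool):
    # Hypothetical tool, written only to illustrate the naming and description guidelines above.
    name = "image_brightener"
    description = (
        "This is a tool that brightens an image. It takes an input named `image` which should be the "
        "image to brighten (a PIL image), and an input named `factor` which should be a float between 0 "
        "and 2 describing how much to brighten the image. It returns the brightened image."
    )

    def __call__(self, image, factor: float):
        from PIL import ImageEnhance

        return ImageEnhance.Brightness(image).enhance(factor)
```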
<Tip>
@@ -137,7 +139,7 @@ The third part includes a set of curated examples that show the agent exactly wh
for what kind of user request. The large language models empowering the agent are extremely good at
recognizing patterns in a prompt and repeating the pattern with new data. Therefore, it is very important
that the examples are written in a way that maximizes the likelihood of the agent generating correct,
executable code in practice.
Let's have a look at one example:
@@ -466,7 +468,8 @@ The set of curated tools already has an `image_transformer` tool which is hereby
Overwriting existing tools can be beneficial if we want to use a custom tool for exactly the same task as an existing tool,
because the agent is already well-versed in that specific task. Beware that the custom tool should follow the exact same API
as the overwritten tool in this case.
as the overwritten tool in this case, or you should adapt the prompt template to make sure all examples using that
tool are updated.
</Tip>
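As a minimal sketch (assuming a replacement tool that keeps the same name and call signature as the curated `image_transformer`; the repo id below is only an example), overriding is done by passing the new tool through `additional_tools` when instantiating the agent:

```python
from transformers import HfAgent, load_tool

# Load a custom image-transformation tool from the Hub.
controlnet_transformer = load_tool("diffusers/controlnet-canny-tool")

# Because the tool shares its name with the curated `image_transformer`, it overrides it for this agent.
agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[controlnet_transformer],
)
```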
@@ -627,14 +630,14 @@ In order to let others benefit from it and for simpler initialization, we recomm
namespace. To do so, just call `push_to_hub` on the `tool` variable:
```python
tool.push_to_hub("lysandre/hf-model-downloads")
tool.push_to_hub("hf-model-downloads")
```
You now have your code on the Hub! Let's take a look at the final step, which is to have the agent use it.
#### Having the agent use the tool
We now have our tool that lives on the Hub which can be instantiated as such:
We now have our tool that lives on the Hub, which can be instantiated as follows (replace the user name with your own):
```python
from transformers import load_tool
......
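As a hedged sketch of the full flow (the repo id below is a placeholder for your own namespace), the freshly pushed tool can be loaded and handed to an agent through `additional_tools`:

```python
from transformers import HfAgent, load_tool

# Placeholder repo id: use the namespace/name you pushed the tool to.
tool = load_tool("<your-username>/hf-model-downloads")

# Pass the tool to the agent on top of the curated toolbox.
agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[tool],
)
agent.run("Which checkpoint has the most downloads for text-to-video on the Hugging Face Hub?")
```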
@@ -19,7 +19,8 @@ can vary as the APIs or underlying models are prone to change.
</Tip>
Transformers version v4.29.0, building on the concept of *tools* and *agents*.
Transformers version v4.29.0 builds on the concept of *tools* and *agents*. You can play with it in
[this colab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj).
In short, it provides a natural language API on top of transformers: we define a set of curated tools and design an
agent to interpret natural language and to use these tools. It is extensible by design; we curated some relevant tools,
@@ -60,10 +61,19 @@ agent.run(
## Quickstart
Before being able to use `agent.run`, you will need to instantiate an agent, which is a large language model (LLM).
We recommend using the [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) checkpoint as it works very well
for the task at hand and is open-source, but please find other examples below.
We provide support for OpenAI models as well as open-source alternatives from BigCode and OpenAssistant. The OpenAI
models perform better (but require you to have an OpenAI API key, so they cannot be used for free); Hugging Face
provides free access to endpoints for BigCode and OpenAssistant models.
Start by logging in to have access to the Inference API:
To use OpenAI models, you instantiate an [`OpenAiAgent`]:
```py
from transformers import OpenAiAgent
agent = OpenAiAgent(model="text-davinci-003", api_key="<your_api_key>")
```
To use BigCode or OpenAssistant, start by logging in to have access to the Inference API:
```py
from huggingface_hub import login
@@ -76,17 +86,22 @@ Then, instantiate the agent
```py
from transformers import HfAgent
# Starcoder
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
# StarcoderBase
# agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoderbase")
# OpenAssistant
# agent = HfAgent(url_endpoint="https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
```
This is using the inference API that Hugging Face provides for free at the moment if you have your inference
This uses the inference API that Hugging Face provides for free at the moment. If you have your own inference
endpoint for this model (or another one), you can replace the URL above with your URL endpoint.
<Tip>
We're showcasing StarCoder as the default in the documentation as the model is free to use and performs admirably well
on simple tasks. However, the checkpoint doesn't hold up when handling more complex prompts. If you're facing such an
issue, we recommend trying out the OpenAI model which, while sadly not open-source, performs better at this given time.
StarCoder and OpenAssistant are free to use and perform admirably well on simple tasks. However, the checkpoints
don't hold up when handling more complex prompts. If you're facing such an issue, we recommend trying out the OpenAI
model which, while sadly not open-source, performs better at this time.
</Tip>
@@ -97,7 +112,7 @@ You're now good to go! Let's dive into the two APIs that you now have at your di
The single execution method is when using the [`~Agent.run`] method of the agent:
```py
agent.run("Draw me a picture of rivers and lakes")
agent.run("Draw me a picture of rivers and lakes.")
```
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png" width=200>
@@ -107,7 +122,7 @@ can perform one or several tasks in the same instruction (though the more comple
the agent is to fail).
```py
agent.chat("Draw me a picture of the sea then transform the picture to add an island.")
agent.run("Draw me a picture of the sea then transform the picture to add an island")
```
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sea_and_island.png" width=200>
@@ -118,15 +133,16 @@ agent.chat("Draw me a picture of the sea then transform the picture to add an is
Every [`~Agent.run`] operation is independent, so you can run it several times in a row with different tasks.
Note that your `agent` is just a large-language model, so small variations in your prompt might yield completely
different results. It's important to explain as clearly as possible the task you want to perform.
different results. It's important to explain as clearly as possible the task you want to perform. We go more in-depth
on how to write good prompts [here](custom_tools#writing-good-user-inputs).
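As a quick illustration (both prompts below are just made-up examples), being explicit about the action and the subject makes it much easier for the agent to pick the right tool:

```py
# Too vague: the agent has to guess both the tool and the subject.
agent.run("rivers and lakes")

# Clear: the action ("draw a picture") and the subject are stated explicitly.
agent.run("Draw me a picture of rivers and lakes with a mountain in the background.")
```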
If you'd like to keep a state across executions or to pass non-text objects to the agent, you can do so by specifying
variables that you would like the agent to use. For example, you could generate the first image of rivers and lakes,
and ask the model to update that picture to add an island by doing the following:
```python
picture = agent.run("Draw me a picture of rivers and lakes")
updated_picture = agent.chat("Take that `picture` and add an island to it", picture=picture)
picture = agent.run("Generate a picture of rivers and lakes.")
updated_picture = agent.run("Transform the image in `picture` to add an island to it.", picture=picture)
```
<Tip>
@@ -155,7 +171,7 @@ agent.run("Draw me a picture of the `prompt`", prompt="a capybara swimming in th
The agent also has a chat-based approach, using the [`~Agent.chat`] method:
```py
agent.chat("Draw me a picture of rivers and lakes")
agent.chat("Generate a picture of rivers and lakes")
```
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png" width=200>
@@ -197,6 +213,8 @@ agent.chat("Draw me a picture of rivers and lakes", remote=True)
### What's happening here? What are tools, and what are agents?
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/diagram.png">
#### Agents
The "agent" here is a large language model, and we're prompting it so that it has access to a specific set of tools.
@@ -270,6 +288,7 @@ directly with the agent. We've added a few
- **Text downloader**: to download a text from a web URL
- **Text to image**: generate an image according to a prompt, leveraging stable diffusion
- **Image transformation**: modify an image given an initial image and a prompt, leveraging instruct pix2pix stable diffusion
- **Text to video**: generate a small video according to a prompt, leveraging damo-vilab
The text-to-image tool we have been using since the beginning is a remote tool that lives in
[*huggingface-tools/text-to-image*](https://huggingface.co/spaces/huggingface-tools/text-to-image)! We will
@@ -278,32 +297,6 @@ continue releasing such tools on this and other organizations, to further superc
The agents have access by default to tools that reside on `huggingface-tools`.
We explain how you can write and share your own tools, as well as leverage any custom tool that resides on the Hub, in the [following guide](custom_tools).
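As a sketch (assuming `load_tool` accepts the Space repo id shown), such a remote tool can also be loaded and called on its own, outside of any agent:

```py
from transformers import load_tool

# Load the remote text-to-image tool hosted under the huggingface-tools organization.
text_to_image = load_tool("huggingface-tools/text-to-image")

# Tools are callable objects, so they can be used directly.
image = text_to_image("A river flowing between snowy mountains")
```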
### Leveraging different agents
We showcase here how to use the [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) model as an LLM, but
it isn't the only model available. We also support the OpenAssistant model and OpenAI's davinci models (3.5 and 4).
We're planning on supporting local language models in an ulterior version.
The tools defined in this implementation are agnostic to the agent used; we are showcasing the agents that work with
our prompts below, but the tools can also be used with Langchain, Minichain, or any other Agent-based library.
#### Example code for the OpenAssistant model
```py
from transformers import HfAgent
agent = HfAgent(url_endpoint="https://OpenAssistant/oasst-sft-1-pythia-12b", token="<HF_TOKEN>")
```
#### Example code for OpenAI models
```py
from transformers import OpenAiAgent
agent = OpenAiAgent(model="text-davinci-003", api_key="<API_KEY>")
```
### Code generation
So far we have shown how to use the agents to perform actions for you. However, the agent is only generating code
......
@@ -264,7 +264,9 @@ class Agent:
"""
prompt = self.format_prompt(task, chat_mode=True)
result = self.generate_one(prompt, stop=["Human:", "====="])
self.chat_history = prompt + result + "\n"
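# Store the prompt plus the generated answer; ensure the history ends with a newline
# (without doubling it) so the next chat turn starts on a fresh line.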
self.chat_history = prompt + result
if not self.chat_history.endswith("\n"):
self.chat_history += "\n"
explanation, code = clean_code_for_chat(result)
print(f"==Explanation from the agent==\n{explanation}")
......