1. it follows the [messages format](./chat_templating.md)for its input (`List[Dict[str, str]]`) and returns a `str`
1. it follows the [messages format](./chat_templating.md)(`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
2. it stops generating outputs at the sequences passed in the argument `stop`
2. it stops generating outputs at the sequences passed in the argument `stop_sequences`
You also need a `tools` argument which accepts a list of `Tools`. You can provide an empty list for `tools`, but use the default toolbox with the optional argument `add_base_tools=True`.
Additionally, `llm_engine` can also take a `grammar` argument. In the case where you specify a `grammar` upon agent initialization, this argument will be passed to the calls to llm_engine, with the `grammar` that you defined upon initialization, to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) in order to force properly-formatted agent outputs.
You will also need a `tools` argument which accepts a list of `Tools` - it can be an empty list. You can also add the default toolbox on top of your `tools` list by defining the optional argument `add_base_tools=True`.
Now you can create an agent, like [`CodeAgent`], and run it. For convenience, we also provide the [`HfEngine`] class that uses `huggingface_hub.InferenceClient` under the hood.
Now you can create an agent, like [`CodeAgent`], and run it. For convenience, we also provide the [`HfEngine`] class that uses `huggingface_hub.InferenceClient` under the hood.
Task: "Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French."
Task: "Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French."
I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
@@ -75,7 +75,7 @@ final_answer(f"The answer is {answer}")
...
@@ -75,7 +75,7 @@ final_answer(f"The answer is {answer}")
---
---
Task: "Identify the oldest person in the `document` and create an image showcasing the result."
Task: "Identify the oldest person in the `document` and create an image showcasing the result."
I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
Thought: I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
Code:
Code:
```py
```py
answer = document_qa(document, question="What is the oldest person?")
answer = document_qa(document, question="What is the oldest person?")
...
@@ -87,7 +87,7 @@ final_answer(image)
...
@@ -87,7 +87,7 @@ final_answer(image)
---
---
Task: "Generate an image using the text given in the variable `caption`."
Task: "Generate an image using the text given in the variable `caption`."
I will use the following tool: `image_generator` to generate an image.
Thought: I will use the following tool: `image_generator` to generate an image.
Code:
Code:
```py
```py
image = image_generator(prompt=caption)
image = image_generator(prompt=caption)
...
@@ -97,7 +97,7 @@ final_answer(image)
...
@@ -97,7 +97,7 @@ final_answer(image)
---
---
Task: "Summarize the text given in the variable `text` and read it out loud."
Task: "Summarize the text given in the variable `text` and read it out loud."
I will use the following tools: `summarizer` to create a summary of the input text, then `text_reader` to read it out loud.
Thought: I will use the following tools: `summarizer` to create a summary of the input text, then `text_reader` to read it out loud.
Code:
Code:
```py
```py
summarized_text = summarizer(text)
summarized_text = summarizer(text)
...
@@ -109,7 +109,7 @@ final_answer(audio_summary)
...
@@ -109,7 +109,7 @@ final_answer(audio_summary)
---
---
Task: "Answer the question in the variable `question` about the text in the variable `text`. Use the answer to generate an image."
Task: "Answer the question in the variable `question` about the text in the variable `text`. Use the answer to generate an image."
I will use the following tools: `text_qa` to create the answer, then `image_generator` to generate an image according to the answer.
Thought: I will use the following tools: `text_qa` to create the answer, then `image_generator` to generate an image according to the answer.
Code:
Code:
```py
```py
answer = text_qa(text=text, question=question)
answer = text_qa(text=text, question=question)
...
@@ -121,7 +121,7 @@ final_answer(image)
...
@@ -121,7 +121,7 @@ final_answer(image)
---
---
Task: "Caption the following `image`."
Task: "Caption the following `image`."
I will use the following tool: `image_captioner` to generate a caption for the image.
Thought: I will use the following tool: `image_captioner` to generate a caption for the image.
Code:
Code:
```py
```py
caption = image_captioner(image)
caption = image_captioner(image)
...
@@ -292,7 +292,6 @@ print(answer)
...
@@ -292,7 +292,6 @@ print(answer)
Observation: "The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland."
Observation: "The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland."
Thought: I will now generate an image showcasing the oldest person.
Thought: I will now generate an image showcasing the oldest person.
Code:
Code:
```py
```py
image = image_generator("A portrait of John Doe, a 55-year-old man living in Canada.")
image = image_generator("A portrait of John Doe, a 55-year-old man living in Canada.")
...
@@ -303,7 +302,6 @@ final_answer(image)
...
@@ -303,7 +302,6 @@ final_answer(image)
Task: "What is the result of the following operation: 5 + 3 + 1294.678?"
Task: "What is the result of the following operation: 5 + 3 + 1294.678?"
Thought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool
Thought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool