1. You choose a topic area (e.g., "news", "NVidia", "music", etc.).
2. Gets the most recent articles on that topic from various sources.
3. Uses Ollama to summarize each article.
4. Creates chunks of sentences from each article.
5. Uses Sentence Transformers to generate embeddings for each of those chunks.
6. You enter a question regarding the summaries shown.
7. Uses Sentence Transformers to generate an embedding for that question.
8. Uses the embedded question to find the most similar chunks.
9. Feeds all that to Ollama to generate a good answer to your question based on these news articles.
This example lets you pick from a few different topic areas, then summarizes the most recent articles for that topic. It then creates chunks of sentences from each article and generates embeddings for each of those chunks.
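At its core, the retrieval step is just embedding the chunks and the question with the same model and ranking the chunks by similarity. Here is a minimal sketch of that idea using `sentence-transformers`; the model name and chunk text are illustrative, not necessarily what the example uses:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative embedding model; the example may use a different one.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Sentence chunks taken from the summarized articles (placeholder text here).
chunks = [
    "NVidia announced a new line of GPUs at its conference.",
    "The keynote also covered updates to the CUDA toolkit.",
]
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)

# Embed the user's question the same way, then rank chunks by cosine similarity.
question = "What did NVidia announce?"
question_embedding = model.encode(question, convert_to_tensor=True)
scores = util.cos_sim(question_embedding, chunk_embeddings)[0]

# The highest-scoring chunks become the context passed to Ollama for the final answer.
top_chunks = [chunks[int(i)] for i in scores.argsort(descending=True)[:3]]
```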
## Running the Example
1. Ensure you have the `mistral-openorca` model installed:
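```bash
ollama pull mistral-openorca
```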
The **chat** endpoint is one of two ways to generate text from an LLM with Ollama; it was introduced in version 0.1.14. At a high level, you provide the endpoint an array of message objects, each with a role and content. With each prompt and each response, you append more of those role/content objects, building up the conversation history.
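For example, after one round trip the list of messages might look like this (the content strings are only illustrative):

```python
messages = [
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "The short answer is Rayleigh scattering..."},
    {"role": "user", "content": "Does the same effect make sunsets red?"},
]
```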
## Running the Example
1. Ensure you have the `llama3.2` model installed:
```bash
ollama pull llama3.2
```
2. Install the Python Requirements.
```bash
pip install -r requirements.txt
```
3. Run the example:
```bash
python client.py
```
## Review the Code
You can see in the **chat** function that actually calling the endpoint takes only a few lines.
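Here is a minimal sketch of what that call can look like, assuming the `requests` library and an Ollama server on the default local port; the example's actual code may differ in the details:

```python
import json
import requests

def chat(messages, model="llama3.2"):
    # Stream the response from the local /api/chat endpoint.
    r = requests.post(
        "http://localhost:11434/api/chat",
        json={"model": model, "messages": messages, "stream": True},
        stream=True,
    )
    r.raise_for_status()
    content = ""
    for line in r.iter_lines():
        if not line:
            continue
        body = json.loads(line)
        if not body.get("done"):
            piece = body["message"]["content"]
            content += piece
            print(piece, end="", flush=True)
        else:
            print()
            # The final object doesn't repeat the full text, so return what we assembled.
            return {"role": "assistant", "content": content}
```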
With the **generate** endpoint, you need to provide a `prompt`. But with **chat**, you provide `messages`. And the resulting stream of responses includes a `message` object with a `content` field.
The final JSON object doesn't provide the full content, so you will need to build the content yourself.
In the **main** function, we collect `user_input`, append it to the list of messages, and pass that list to the **chat** function. When the LLM is done responding, the output is appended as another message.
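The loop in **main** can be as small as this sketch (reusing the `chat` function above; the prompt text is illustrative):

```python
messages = []
while True:
    user_input = input("Enter a prompt: ")
    if not user_input:
        break
    messages.append({"role": "user", "content": user_input})
    # Appending the assistant's reply means the next turn sees the full history.
    messages.append(chat(messages))
```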
## Next Steps
In this example, all generations are kept. You might want to experiment with summarizing everything older than the last 10 conversations so you can keep a longer history while using less of the context window.
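One possible way to experiment with that, sketched here as an idea rather than part of the example, is to ask the model to compress the oldest messages into a single summary message once the history grows; the helper name and summary prompt below are hypothetical:

```python
def compact_history(messages, keep_last=10):
    # Hypothetical helper: fold everything older than the last `keep_last` messages
    # into one summary message so long conversations use less context.
    if len(messages) <= keep_last:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    request = old + [{"role": "user", "content": "Summarize our conversation so far in a few sentences."}]
    summary = chat(request)  # reuse the chat function from above
    return [{"role": "system", "content": f"Earlier conversation, summarized: {summary['content']}"}] + recent
```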
This is a simple example using the **Generate** endpoint.
## Running the Example
1. Ensure you have the `stablelm-zephyr` model installed:
```bash
ollama pull stablelm-zephyr
```
2. Install the Python Requirements.
```bash
pip install -r requirements.txt
```
3. Run the example:
```bash
python client.py
```
## Review the Code
The **main** function simply asks for input, then passes it to the **generate** function. The output from **generate** is then passed back into **generate** on the next call.
The **generate** function uses `requests.post` to call `/api/generate`, passing the model, prompt, and context. The endpoint returns a stream of JSON blobs; the function iterates through them, printing each `response` value as it arrives. The final JSON object includes the full context of the conversation so far, and that context is the function's return value.
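A minimal sketch of that flow, assuming the `requests` library and the default local server (names and details may differ from the example's actual code):

```python
import json
import requests

def generate(prompt, context, model="stablelm-zephyr"):
    # Stream a completion from /api/generate, carrying the conversation context forward.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "context": context},
        stream=True,
    )
    r.raise_for_status()
    for line in r.iter_lines():
        if not line:
            continue
        body = json.loads(line)
        print(body.get("response", ""), end="", flush=True)
        if body.get("done"):
            print()
            # The final object includes the full context so far; return it for the next call.
            return body.get("context", [])
```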
```typescript
  const locationString = location ? ` at ${location}` : ``;
  console.log(`You have an event: ${nameString}${dateString}${locationString}`);
}

// function to be called on addresses
function reportAddresses(address) {
  for (const field in address) {
    if (address[field]) {
      if (field === "city") {
        const city = address.city;
        const state = address.state ? `, ${address.state}` : '';
        const zip = address.zip ? ` ${address.zip}` : '';
        console.log(`${city}${state}${zip}`);
        break;
      } else {
        console.log(`${address[field]}`);
      }
    }
  }
  console.log(``);
}
```
```typescript
async function main() {
  const ollama = new Ollama();

  const systemprompt = `You will be given a text along with a prompt and a schema. You will have to extract the information requested in the prompt from the text and generate output in JSON observing the schema provided. If the schema shows a type of integer or number, you must only show a integer for that field. A string should always be a valid string. If a value is unknown, leave it empty. Output the JSON with extra spaces to ensure that it pretty prints.`

  const schema = {
    "eventsQuantity": {
      "type": "integer",
      "description": "The number of events in the source text"
    },
    "addressesQuantity": {
      "type": "integer",
      "description": "The number of addresses in the source text"
    },
    "events": [{
      "name": {
        "type": "string",
        "description": "Name of the event"
      },
      "date": {
        "type": "string",
        "description": "Date of the event"
      },
      "location": {
        "type": "string",
        "description": "Location of the event"
      },
      "extraInfo": {
        "type": "string",
        "description": "Any extra information that is provided about the event."
      }
    }],
    "people": [{
      "name": {
        "type": "string",
        "description": "Name of the person"
      },
      "company": {
        "type": "string",
        "description": "Name of the company where they work"
      },
      "street": {
        "type": "string",
        "description": "Street address of the person or company. This is only the street name and the numerical address. Do not include city, state, or zip of the address in this field."
      },
      "city": {
        "type": "string",
        "description": "City portion of the address of the person or company"
      },
      "state": {
        "type": "string",
        "description": "State portion of the address of the person or company"
      },
      "zip": {
        "type": "string",
        "description": "Zip code of the person or company"
      },
      "extraInfo": {
        "type": "string",
        "description": "Any extra information that is provided about the location."
      }
    }]
  }

  const prompt = `The source text is a series of emails that have been put into a single file. They are separated by three dashes. Review the source text and determine the full address of the person sending each of the emails as well as any events that we need to track. If they provide a company address use that. If any extra info is provided, such as a description of the place, or a floor, add it to extraInfo. The first field in the address JSON is quantity of events and should be set to the number of events tracked and the second field should be set to the number of addresses tracked in the file. Don't stuff an event into the output that isn't an event. Only add data to the mostly appropriate field. Don't make up fields that aren't in the schema. If there isn't a value for a field, use null. Output should be in JSON.\n\nSchema: \n${JSON.stringify(schema, null, 2)}\n\nSource Text:\n${textcontent}`

  await ollama.setModel("neural-chat");
  ollama.setSystemPrompt(systemprompt);
  ollama.setJSONFormat(true);
  const data = await ollama.generate(prompt);
  const output = JSON.parse(data.output);
  const events = output.events;
  const addresses = output.people;
  console.log(`Here are your ${output.eventsQuantity} events:`);
```
```typescript
// Set the system prompt to prepare the model to receive a prompt and a schema and set some rules for the output.
const systemprompt = `You will be given a text along with a prompt and a schema. You will have to extract the information requested in the prompt from the text and generate output in JSON observing the schema provided. If the schema shows a type of integer or number, you must only show a integer for that field. A string should always be a valid string. If a value is unknown, leave it empty. Output the JSON with extra spaces to ensure that it pretty prints.`

const schema = {
  "people": [{
    "name": {
      "type": "string",
      "description": "Name of the person"
    },
    "title": {
      "type": "string",
      "description": "Title of the person"
    }
  }],
}

// Depending on the model chosen, you may be limited by the size of the context window, so limit the context to 2000 words.
const prompt = `Review the source text and determine the 10 most important people to focus on. Then extract the name and title for those people. Output should be in JSON.\n\nSchema: \n${JSON.stringify(schema, null, 2)}\n\nSource Text:\n${textcontent}`

await ollama.setModel("neural-chat");
ollama.setSystemPrompt(systemprompt);
// setJSONFormat is the equivalent of setting 'format: json' in the API
```
```
thanks for letting me know that you are going to come today, November 16, for my tea party. My address is 123 Falk St on Bainbridge Island. I live in the house with the red door. I will be home all day so just come by whenever you want.
Fred
---
Great, send the check to our office at 1917 1st St, Seattle, WA 98101. I will let you know when we receive it.
Mark Richardson
Big Corp
---
We are looking forward to seeing you at our Local AI Meetup. It will be held on December 3. It will be at the offices of Enormous Co. Our address is 344 1st Ave, Seattle, WA 98101. We will be meeting in the conference room on the 3rd floor.
```
One of the features added to some models is 'function calling'. It's a bit of a confusing name. It's understandable if you think that means the model can call functions, but that's not what it means. Function calling simply means that the output of the model is formatted in JSON, using a preconfigured schema, and uses the expected types. Then your code can use the output of the model and call functions with it. Using the JSON format in Ollama, you can use any model for function calling.
The two examples provided can extract information out of the provided texts. The first example uses the first couple of chapters from War and Peace by Lev Nikolayevich Tolstoy, and extracts the names and titles of the characters introduced in the story. The second example uses a more complicated schema to pull out addresses and event information from a series of emails.
## Running the examples
1. Clone this repo and navigate to the `examples/typescript-functioncalling` directory.
2. Install the dependencies with `npm install`.
3. Review the `wp.txt` file.
4. Run `tsx extractwp.ts`.
5. Review the `info.txt` file.
6. Run `tsx extractemail.ts`.
## Review the Code
Both examples do roughly the same thing with different source material. They both use the same system prompt, which tells the model to expect some instructions and a schema. Then we inject the schema into the prompt and generate an answer.
The first example, `extractwp.ts`, outputs the resulting JSON to the console, listing the characters introduced at the start of War and Peace. The second example, `extractemail.ts`, is a bit more complicated, extracting two different types of information: addresses and events. It outputs the results to a JSON blob, then the addresses are handed off to one function called `reportAddresses` and the events are handed off to another function called `reportEvents`.
Notice that both examples are using the model from Intel called `neural-chat`. This is not a model tuned for function calling, yet it performs very well at this task.
## Next Steps
Try exporting some of your real emails to the input file and see how well the model does. Try pointing the first example at other books. You could even have it cycle through all the sections and add up the number of times each character appears throughout the book to determine the most important characters. You can also try out different models.
This example demonstrates how one would create a set of 'mentors' you can have a conversation with. The mentors are generated using the `character-generator.ts` file. This will use **Stable Beluga 70b** to create a bio and a list of verbal tics and common phrases used by each person. Then `mentors.ts` will take a question, choose three of the 'mentors', and start a conversation with them. Occasionally they will talk to each other, and other times they will just deliver a set of monologues. It's fun to see what they do and say.
## Usage
1. Add llama3 to have the mentors answer your questions:
```bash
ollama pull llama3
```
2. Install prerequisites:
```bash
npm install
```
3. Ask a question:
```bash
npm start "what is a jackalope"
```
You can also add your own character to be chosen at random when you ask a question.
1. Make sure you have the right model installed:
```bash
ollama pull stablebeluga2:70b-q4_K_M
```
2. Create a new character:
```bash
npm run charactergen "Lorne Greene"
```
You can choose any well-known person you like. This example will create `lornegreene/Modelfile`.
3. Create the model from that Modelfile with `ollama create`, using the username you chose on ollama.com as the namespace (for example, `ollama create <username>/lornegreene -f lornegreene/Modelfile`). `username` is whatever name you set up when you signed up at [https://ollama.com/signup](https://ollama.com/signup).
4. To add this to your mentors, you will have to update the code as follows. On line 8 of `mentors.ts`, add an object to the array, replacing `<username>` with the username you used above.
```typescript
{ns: "<username>", char: "Lorne Greene"}
```
## Review the Code
There are two scripts you can run in this example. The first is the main script to ask the mentors a question. The other one lets you generate a character to add to the mentors. Both scripts are mostly about adjusting the prompts at each inference stage.
### mentors.ts
In the **main** function, it starts by generating a list of mentors, choosing three from a list of interesting characters. Then we ask for a question, and things get interesting: we set the prompt for each of the three mentors a little differently, and the second and third mentors see what the previous ones said. The other functions in `mentors.ts` set up the prompts for each mentor.
### character-generator.ts
**Character Generator** simply customizes the prompt to build a character profile for any famous person; most of the script is just tweaking that prompt. It uses Stable Beluga 2, a 70b-parameter model. The 70b models tend to do a better job writing a bio about a character than smaller models, and Stable Beluga seemed to do better than Llama 2. Since this is used at development time to create the characters, it doesn't affect the runtime of asking the mentors for their input.
```typescript
const bio = await ollama.generate(`create a bio of ${character} in a single long paragraph. Instead of saying '${character} is...' or '${character} was...' use language like 'You are...' or 'You were...'. Then create a paragraph describing the speaking mannerisms and style of ${character}. Don't include anything about how ${character} looked or what they sounded like, just focus on the words they said. Instead of saying '${character} would say...' use language like 'You should say...'. If you use quotes, always use single quotes instead of double quotes. If there are any specific words or phrases you used a lot, show how you used them. `);

const thecontents = `FROM llama3\nSYSTEM """\n${bio.response.replace(/(\r\n|\n|\r)/gm, "").replace('would', 'should')} All answers to questions should be related back to what you are most known for.\n"""`;
```