1. Create the Ollama namespace, deployment, and service
```bash
kubectl apply -f cpu.yaml
```
## (Optional) Hardware Acceleration
Hardware acceleration in Kubernetes requires NVIDIA's [`k8s-device-plugin`](https://github.com/NVIDIA/k8s-device-plugin), which is deployed in the cluster as a DaemonSet. Follow the link for more details.
Once configured, create a GPU-enabled Ollama deployment.
```bash
kubectl apply -f gpu.yaml
```
## Test
1. Port forward the Ollama service to connect and use it locally
```bash
...
@@ -24,13 +36,3 @@
```bash
ollama run orca-mini:3b
```
prompt=f"generate one realistically believable sample data set of a persons first name, last name, address in {country}, and phone number. Do not use common names. Respond using JSON. Key names should have no backslashes, values should use plain ascii with no special characters."
@@ -15,7 +15,7 @@ async function characterGenerator() {
...
ollama.setModel("stablebeluga2:70b-q4_K_M");
const bio = await ollama.generate(`create a bio of ${character} in a single long paragraph. Instead of saying '${character} is...' or '${character} was...' use language like 'You are...' or 'You were...'. Then create a paragraph describing the speaking mannerisms and style of ${character}. Don't include anything about how ${character} looked or what they sounded like, just focus on the words they said. Instead of saying '${character} would say...' use language like 'You should say...'. If you use quotes, always use single quotes instead of double quotes. If there are any specific words or phrases you used a lot, show how you used them. `);
const thecontents = `FROM llama3\nSYSTEM """\n${bio.response.replace(/(\r\n|\n|\r)/gm,"").replace('would','should')} All answers to questions should be related back to what you are most known for.\n"""`;
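To see what the Modelfile string looks like without calling a model, the same string handling can be pulled into a small standalone function. This is only a sketch: `buildModelfile` and the sample bio are hypothetical names and data that do not appear in the original script, and the default base model simply mirrors the `FROM llama3` line above.

```javascript
// Hypothetical helper (not in the original script): builds the Modelfile text
// from a generated bio, using the same replacements as the line above.
function buildModelfile(bioText, baseModel = "llama3") {
  const system = bioText
    .replace(/(\r\n|\n|\r)/gm, "") // strip every line break; no space is inserted
    .replace("would", "should");   // string pattern: only the first match is replaced
  return `FROM ${baseModel}\nSYSTEM """\n${system} All answers to questions should be related back to what you are most known for.\n"""`;
}

// Made-up bio text, for illustration only.
const example = buildModelfile("You were a poet.\nYou would say 'hark'.");
console.log(example);
```

Two details of the original expression show up here: stripping line breaks joins adjacent sentences with no space between them, and `.replace('would','should')` with a string pattern changes only the first occurrence. If every occurrence should change, `replaceAll` or a regular expression with the `g` flag would be one option.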