Commits · b330c830d3ea1f38facb7132d7018ca842010e24 · OpenDAS / ollama

16 Sep, 2024 1 commit
- readme: add vim-intelligence-bridge to Terminal section (#6818) · b330c830
  Pepo authored Sep 15, 2024
  
  b330c830
15 Sep, 2024 1 commit
- readme: add Obsidian Quiz Generator plugin to community integrations (#6789) · d889c6fd
  Edward Cui authored Sep 14, 2024
  
  d889c6fd
13 Sep, 2024 1 commit
- Fix incremental builds on linux (#6780) · 56b9af33
  Daniel Hiltgen authored Sep 13, 2024
```
scripts: fix incremental builds on linux or similar
```
  56b9af33
12 Sep, 2024 6 commits
- Use GOARCH for build dirs (#6779) · fda0d3be
  Daniel Hiltgen authored Sep 12, 2024
```
Corrects x86_64 vs amd64 discrepancy
```
  fda0d3be
- Optimize container images for startup (#6547) · cd5c8f64
  Daniel Hiltgen authored Sep 12, 2024
```
* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release
```
  cd5c8f64
- examples: updated requirements.txt for privategpt example · fef257c5
  dcasota authored Sep 12, 2024
  
  fef257c5
- examples: polish loganalyzer example (#6744) · d066d9b8
  Adrian Cole authored Sep 12, 2024
  
  d066d9b8
- readme: add ollama_moe to community integrations (#6752) · 5a00dc9f
  RAPID ARCHITECT authored Sep 11, 2024
  
  5a00dc9f
- Merge pull request #6767 from ollama/jessegross/bug_6707 · c354e878
  Jesse Gross authored Sep 11, 2024
```
runner: Flush pending responses before returning
```
  c354e878
11 Sep, 2024 6 commits
- runner: Flush pending responses before returning · 93ac3760
  Jesse Gross authored Sep 11, 2024
```
If there are any pending responses (such as from potential stop
tokens) then we should send them back before ending the sequence.
Otherwise, we can be missing tokens at the end of a response.

Fixes #6707
```
  93ac3760
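The behaviour this commit describes can be sketched as follows. This is a minimal Python illustration, not the runner's actual Go code: the function name `stream_with_stop` and the hold-back strategy are assumptions made for the example. Text that might be the start of a stop sequence is held back, and anything still held when generation ends is flushed instead of being dropped.

```python
from typing import Iterable, Iterator, List


def stream_with_stop(pieces: Iterable[str], stop: List[str]) -> Iterator[str]:
    """Yield generated text, holding back any tail that could begin a stop sequence.

    Hypothetical helper for illustration only.
    """
    pending = ""
    for piece in pieces:
        pending += piece

        # A complete stop sequence was produced: emit the text before it and end.
        hits = [i for i in (pending.find(s) for s in stop) if i != -1]
        if hits:
            cut = min(hits)
            if cut:
                yield pending[:cut]
            return

        # Hold back the longest suffix that is a proper prefix of a stop sequence.
        hold = 0
        for s in stop:
            for n in range(min(len(s) - 1, len(pending)), 0, -1):
                if pending.endswith(s[:n]):
                    hold = max(hold, n)
                    break

        emit = pending[: len(pending) - hold]
        if emit:
            yield emit
        pending = pending[len(pending) - hold:]

    # Generation ended while text was still held back: flush it before returning.
    if pending:
        yield pending
```

Without the final flush, a response ending in a character that merely resembles the start of a stop sequence would lose that character.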

- add "stop" command (#6739) · abed273d
  Patrick Devine authored Sep 11, 2024
  
  abed273d
- Merge pull request #6762 from ollama/mxyng/show-output · 03439262
  Michael Yang authored Sep 11, 2024
```
refactor show output
```
  03439262
- refactor show output · ecab6f1c
  Michael Yang authored Sep 11, 2024
```
fixes line wrapping on long texts
```
  ecab6f1c
- readme: add QodeAssist to community integrations (#6754) · 7d690082
  Petr Mironychev authored Sep 11, 2024
  
  7d690082

- Verify permissions for AMD GPU (#6736) · 9246e6dd
  Daniel Hiltgen authored Sep 11, 2024
```
This adds back a check, lost many releases back, that verifies /dev/kfd permissions.
When those permissions are lacking, the failure mode can be confusing:
"rocBLAS error: Could not initialize Tensile host: No devices found"

This implementation does not hard fail the serve command but instead falls back to CPU
with an error log. In the future we can include this in the GPU discovery UX to show
devices that were detected but are unsupported.
```
  9246e6dd
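The check-then-fall-back pattern described in this commit might look roughly like the sketch below. It is written in Python for brevity and is not the project's Go implementation. The function names are hypothetical, and the assumption that both read and write access to `/dev/kfd` are needed is not stated in the commit message.

```python
import logging
import os


def amd_gpu_accessible(kfd_path: str = "/dev/kfd") -> bool:
    """Return True if the AMD compute device node exists and is usable.

    Assumption: "usable" means the current user can read and write the node.
    """
    if not os.path.exists(kfd_path):
        return False
    if not os.access(kfd_path, os.R_OK | os.W_OK):
        # Log and carry on, rather than failing hard.
        logging.error("insufficient permissions on %s, falling back to CPU", kfd_path)
        return False
    return True


def pick_backend(kfd_path: str = "/dev/kfd") -> str:
    """Choose a backend name (hypothetical labels) without aborting on a bad GPU setup."""
    return "rocm" if amd_gpu_accessible(kfd_path) else "cpu"
```

Checking up front gives the user a clear log line instead of the rocBLAS error quoted in the commit message.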

10 Sep, 2024 5 commits
- Merge pull request #6732 from ollama/mxyng/debug-proxy · 735a0ca2
  Michael Yang authored Sep 10, 2024
```
add *_proxy to env map for debugging
```
  735a0ca2
- add *_proxy for debugging · dddb72e0
  Michael Yang authored Sep 10, 2024
  
  dddb72e0
- docs: update examples to use llama3.1 (#6718) · 83a9b527
  Jeffrey Morgan authored Sep 09, 2024
  
  83a9b527
- Quiet down Docker's new lint warnings (#6716) · 4a8069f9
  Daniel Hiltgen authored Sep 09, 2024
```
* Quiet down Docker's new lint warnings

Docker has recently added lint warnings to builds. This cleans up those warnings.

* Fix go lint regression
```
  4a8069f9
- catch when model vocab size is set correctly (#6714) · 84b84ce2
  Patrick Devine authored Sep 09, 2024
  
  84b84ce2
08 Sep, 2024 2 commits
- readme: add crewAI to community integrations (#6699) · bb6a086d
  Jeffrey Morgan authored Sep 08, 2024
  
  bb6a086d
- readme: add crewAI with mesop to community integrations · 30c8f201
  RAPID ARCHITECT authored Sep 08, 2024
  
  30c8f201
07 Sep, 2024 3 commits
- openai: align chat temperature and frequency_penalty options with completion (#6688) · 06d4fba8
  frob authored Sep 07, 2024
  
  06d4fba8
- docs: improve linux install documentation (#6683) · 108fb6c1
  Jeffrey Morgan authored Sep 06, 2024
```
Includes small improvements to document layout and code blocks
```
  108fb6c1
- openai: don't scale temperature or frequency_penalty (#6514) · da915345
  Yaroslav authored Sep 07, 2024
  
  da915345
06 Sep, 2024 5 commits
- readme: add Archyve to community integrations (#6680) · 8a027bc4
  nickthecook authored Sep 06, 2024
  
  8a027bc4
- readme: add Plasmoid Ollama Control to community integrations (#6681) · 5446903f
  imoize authored Sep 07, 2024
  
  5446903f
- Improve logging on GPU too small (#6666) · 56318fb3
  Daniel Hiltgen authored Sep 06, 2024
```
When we determine a GPU is too small for any layers, it's not always clear why.
This will help troubleshoot those scenarios.
```
  56318fb3
- openai: fix "presence_penalty" typo and add test (#6665) · fe91d7ff
  frob authored Sep 06, 2024
  
  fe91d7ff
- Fix gemma2 2b conversion (#6645) · 608e87bf
  Patrick Devine authored Sep 05, 2024
  
  608e87bf
05 Sep, 2024 10 commits
- Document uninstall on windows (#6663) · 48685c6e
  Daniel Hiltgen authored Sep 05, 2024
  
  48685c6e
- Revert "Detect running in a container (#6495)" (#6662) · 9565fa64
  Daniel Hiltgen authored Sep 05, 2024
```
This reverts commit a60d9b89.
```
  9565fa64
- llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT · 67190976
  Daniel Hiltgen authored Sep 05, 2024
```
With the new very large parameter models, some users are willing to wait for
a very long time for models to load.
```
  67190976
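A stall timeout of this kind can be sketched as below. This is an illustrative Python sketch, not the project's code. Only the variable name `OLLAMA_LOAD_TIMEOUT` comes from the commit title. The value format (plain seconds), the 300-second default, and the function names are assumptions made for the example.

```python
import os
import time
from typing import Callable


def load_timeout_seconds(default: float = 300.0) -> float:
    """Read OLLAMA_LOAD_TIMEOUT, assuming (for this sketch) it holds plain seconds."""
    raw = os.environ.get("OLLAMA_LOAD_TIMEOUT", "")
    try:
        return float(raw)
    except ValueError:
        return default  # unset or unparsable: use the hypothetical default


def wait_for_load(progress: Callable[[], float], stall_timeout: float,
                  poll: float = 0.01) -> bool:
    """Wait until progress() reaches 1.0; give up if it stops advancing.

    The timer resets whenever progress increases, so a slow but steadily
    advancing load is never cut off. Only a stall counts against the timeout.
    """
    last = progress()
    last_change = time.monotonic()
    while last < 1.0:
        time.sleep(poll)
        current = progress()
        if current > last:
            last, last_change = current, time.monotonic()
        elif time.monotonic() - last_change > stall_timeout:
            return False
    return True
```

Measuring time since the last progress update, rather than total elapsed time, is what lets very large models take as long as they need while still detecting a hung load.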
- Introduce GPU Overhead env var (#5922) · b05c9e83
  Daniel Hiltgen authored Sep 05, 2024
```
Provide a mechanism for users to set aside an amount of VRAM on each GPU
to make room for other applications they want to start after Ollama, or to work around
memory prediction bugs
```
  b05c9e83
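The effect of reserving VRAM can be shown with a little arithmetic. The sketch below is illustrative only: the function names and the simplified "equal-sized layers" model are assumptions, not the scheduler's actual memory estimation.

```python
GIB = 1024 ** 3


def usable_vram(free_bytes: int, overhead_bytes: int) -> int:
    """VRAM left for the model after setting aside the per-GPU overhead."""
    return max(free_bytes - overhead_bytes, 0)


def layers_that_fit(free_bytes: int, overhead_bytes: int, layer_bytes: int) -> int:
    """How many equal-sized layers fit in the remaining VRAM (simplified model)."""
    return usable_vram(free_bytes, overhead_bytes) // layer_bytes
```

For example, with 8 GiB free, 1 GiB reserved, and 512 MiB per layer, `layers_that_fit(8 * GIB, 1 * GIB, GIB // 2)` gives 14 layers instead of the 16 that would fit with no overhead.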
- Detect running in a container (#6495) · a60d9b89
  Daniel Hiltgen authored Sep 05, 2024
  
  a60d9b89
- Merge pull request #6260 from ollama/mxyng/mem · bf612cd6
  Michael Yang authored Sep 05, 2024
```
llama3.1 memory
```
  bf612cd6
- readme: add AiLama to the list of community integrations (#4957) · ef98e561
  Zeyo authored Sep 06, 2024
  
  ef98e561
- Update gpu.md: Add RTX 3050 Ti and RTX 3050 Ti (#5888) · 5f944baa
  Michael authored Sep 05, 2024
```
* Update gpu.md

Seems strange that the laptop versions of 3050 and 3050 Ti would be supported but not the non-notebook, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>

* Update gpu.md

Remove notebook reference

---------

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
```
  5f944baa
- server: fix blob download when receiving a 200 response (#6656) · 6fc9d227
  Tobias Heinze authored Sep 05, 2024
  
  6fc9d227
- readme: add Gentoo package manager entry to community integrations (#5714) · f27c00d8
  Vitaly Zdanevich authored Sep 05, 2024
  
  f27c00d8