Commits · ffe35490645bcde264866c4d3d435c4c3cc5d34c · OpenDAS / ollama

23 Dec, 2024 3 commits
- readme: add IntelliBar to community integrations (#7950) · ffe35490
  Emanuil Rusev authored Dec 23, 2024
  
  ffe35490
- server: reuse InvalidModelNameErrMsg type (#8163) · 928de905
  湛露先生 authored Dec 23, 2024
  
  928de905
- readme: add Perplexica to community-integrations (#8198) · 36aea615
  ItzCrazyKns authored Dec 23, 2024
  
  36aea615
22 Dec, 2024 1 commit
- fix crash bug with /save when quotes are used (#8208) · dd352ab2
  Patrick Devine authored Dec 21, 2024
  
  dd352ab2
20 Dec, 2024 2 commits
- remove tutorials.md which pointed to removed tutorials (#8189) · d8bab8ea
  Patrick Devine authored Dec 20, 2024
  
  d8bab8ea
- update golang.org/x dependencies (#8172) · 9ab62eb9
  Squishedmac authored Dec 20, 2024
  
  9ab62eb9
19 Dec, 2024 1 commit

llama: test key order preservation in schema_to_grammar (#8078) · 290cf204

Parth Sareen authored Dec 18, 2024

This change adds a test to catch a regression in schema_to_grammar where
the order of keys in the JSON schema is not preserved in the generated
grammar, which is critical for step-by-step reasoning.

290cf204

18 Dec, 2024 1 commit
- scripts: sign renamed macOS binary (#8131) · a72f2dce
  Jeffrey Morgan authored Dec 17, 2024
  
  a72f2dce
17 Dec, 2024 6 commits

llama: Ensure KV cache is fully defragmented. · 08a832b4

Jesse Gross authored Dec 12, 2024

Sometimes the KV cache requires defragmentation even without
triggering the threshold heuristic. In this case, decoding
will not be able to find a KV cache slot. This is particularly
difficult for the caller to handle if it happens in between
ubatches. To avoid this, we should immediately trigger a defrag.

In addition, a heavily fragmented cache can require more than
max_moves to defragment. Currently, we stop when we hit the limit
but this can leave a cache that still does not have adequate space
even after defragmentation is triggered. Instead, we should do
multiple batches of processing until everything is complete.

Fixes #7949
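
The "multiple batches" idea above can be illustrated with a toy model. This is only a sketch, not the llama.cpp implementation: the cache is modelled as a slice of integers where 0 marks a free slot, and `defragPass`, `defragFully`, and the `maxMoves` parameter are hypothetical names chosen for this illustration.

```go
package main

import "fmt"

// defragPass compacts occupied cells toward the front of the cache,
// moving at most maxMoves cells into earlier holes. It returns the
// number of moves made. cells[i] == 0 means slot i is free.
// (Toy model for illustration only; not the llama.cpp code.)
func defragPass(cells []int, maxMoves int) int {
	moves := 0
	hole, tail := 0, len(cells)-1
	for moves < maxMoves {
		// Find the first free slot from the front.
		for hole < len(cells) && cells[hole] != 0 {
			hole++
		}
		// Find the last occupied slot from the back.
		for tail >= 0 && cells[tail] == 0 {
			tail--
		}
		if hole >= tail {
			break // nothing left to compact
		}
		cells[hole], cells[tail] = cells[tail], 0
		moves++
	}
	return moves
}

// defragFully repeats bounded passes until a pass makes no moves,
// instead of stopping after the first pass hits the move limit.
// It returns the number of passes that did work.
func defragFully(cells []int, maxMoves int) int {
	passes := 0
	for defragPass(cells, maxMoves) > 0 {
		passes++
	}
	return passes
}

func main() {
	cells := []int{1, 0, 2, 0, 3, 0, 4, 0}
	passes := defragFully(cells, 1)
	fmt.Println(cells, passes)
}
```

With a limit of one move per pass, a single pass leaves holes between occupied cells; looping until a pass makes no moves leaves all free slots contiguous at the end.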

08a832b4

llm: do not error on "null" format (#8139) · 2ddc32d5
Blake Mizerany authored Dec 17, 2024
```
This fixes another regression in the previous commit that fixed other
known bugs.
```
2ddc32d5
readme: change getting started guide link for pgai (#8119) · 2cde4b88
Jascha Beste authored Dec 17, 2024

2cde4b88

llm: do not silently fail for supplied, but invalid formats (#8130) · 87f0a49f

Blake Mizerany authored Dec 16, 2024

Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commit also takes the opportunity to remove superfluous linter
checks.

87f0a49f

llm: loosen format check to default to no format (#8127) · 0f06a6da
Jeffrey Morgan authored Dec 16, 2024

0f06a6da

darwin: restore multiple runners for x86 (#8125) · 8f805dd7

Daniel Hiltgen authored Dec 16, 2024

In 0.5.2 we simplified packaging to have AVX only for macOS x86. It looks like
there may still be some non-AVX systems out there, so this puts back the prior
logic of building no-AVX for the primary binary, and now 2 runners for AVX and AVX2.
These will be packaged in the App bundle only, so the stand-alone binary will now be
without AVX support on macOS. On ARM, we'll also see these runners reported
as available in the log, but they're dormant and will never be used at runtime.

8f805dd7

16 Dec, 2024 2 commits

readme: example/get started guide for pgai with Ollama (#8115) · 89d5e2f2
Michael authored Dec 16, 2024
89d5e2f2

readme: add pgai to readme for semantic search (#8028) · 297ada6c

Jascha Beste authored Dec 16, 2024

* docs: switch around database integrations order and link to quickstart

* docs: link to blog post in example readme

* chore: link to main readme

* readme: removing example to link externally

readme: removing example to link externally so we don't have to keep this example up-to-date

---------

297ada6c

15 Dec, 2024 1 commit
- imageproc mllama refactor (#7537) · 8c9fb8eb
  Patrick Devine authored Dec 14, 2024
```
Refactor mllama image processing code, and add pixtral and qwen2vl
```
  8c9fb8eb
14 Dec, 2024 2 commits
- ci: be more aggressive on parallelism in build (#8102) · b75ccfc5
  Daniel Hiltgen authored Dec 14, 2024
  
  b75ccfc5
- llama: update vendor code to commit ba1cb19c (#8101) · 7a81daf0
  Jeffrey Morgan authored Dec 14, 2024
  
  7a81daf0
13 Dec, 2024 2 commits
- runner: switch logging back to stderr (#8091) · 60f75560
  Daniel Hiltgen authored Dec 13, 2024
```
This puts the low-level runner logging back on stderr for consistency with prior releases
```
  60f75560
- openai: return usage as final chunk for streams (#6784) · e28f2d49
  Anuraag (Rag) Agrawal authored Dec 13, 2024
```
* openai: return usage as final chunk for streams

---------
Co-authored-by: ParthSareen <parth.sareen@ollama.com>
```
  e28f2d49
12 Dec, 2024 2 commits
- llama: parse JSON schema using nlohmann::ordered_json to maintain ordering (#8071) · c2168505
  Pascal Patry authored Dec 12, 2024
  
  c2168505
- llama: enable JSON schema key ordering for generating grammars (#8055) · 18f6a98b
  Parth Sareen authored Dec 11, 2024
  
  18f6a98b
11 Dec, 2024 10 commits

server: more support for mixed-case model names (#8017) · b1fd7fef
Blake Mizerany authored Dec 11, 2024
```
Fixes #7944
```
b1fd7fef
ci: fix linux version (#8054) · 36d111e7
Daniel Hiltgen authored Dec 11, 2024
```
Pass through the version override so the makefiles use it
```
36d111e7

llama: preserve field order in user-defined JSON schemas (#8002) · 9039c821

Blake Mizerany authored Dec 11, 2024

Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978
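
The ordering loss described above is easy to reproduce in Go, because `encoding/json` writes map keys in sorted order. The following is a minimal sketch under stated assumptions: `roundTrip` and `passThrough` are hypothetical helpers written for this illustration, not functions from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip decodes a schema into a map and re-encodes it, the way a
// validation step might. encoding/json emits map keys in sorted order,
// so the caller's original field order is lost.
func roundTrip(schema []byte) (string, error) {
	var m map[string]any
	if err := json.Unmarshal(schema, &m); err != nil {
		return "", err
	}
	out, err := json.Marshal(m)
	return string(out), err
}

// passThrough keeps the schema as json.RawMessage. Syntax is still
// checked with json.Valid, but the bytes (and so the key order) are
// left untouched.
func passThrough(schema []byte) (string, error) {
	if !json.Valid(schema) {
		return "", fmt.Errorf("invalid JSON schema")
	}
	return string(json.RawMessage(schema)), nil
}

func main() {
	schema := []byte(`{"reasoning":{"type":"string"},"answer":{"type":"string"}}`)
	a, _ := roundTrip(schema)
	b, _ := passThrough(schema)
	fmt.Println(a) // "answer" now comes before "reasoning"
	fmt.Println(b) // original order preserved
}
```

In the round-tripped output the `answer` field precedes `reasoning`, which is the kind of reordering that matters when a model is expected to produce its reasoning before its answer.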

9039c821

ci: fix artifact path prefix for missing windows payloads (#8052) · 581a4a55

Daniel Hiltgen authored Dec 11, 2024

upload-artifacts strips off leading common paths so when
the ./build/ artifacts were removed, the ./dist/windows-amd64
prefix became common and was stripped, making the
later download-artifacts place them in the wrong location
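
The effect can be sketched as follows. This is an illustration of common-prefix stripping in general, not the action's actual code; `commonDir`, `uploadedNames`, and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// commonDir returns the longest directory prefix shared by all paths,
// compared component by component.
func commonDir(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	common := strings.Split(path.Dir(paths[0]), "/")
	for _, p := range paths[1:] {
		parts := strings.Split(path.Dir(p), "/")
		n := 0
		for n < len(common) && n < len(parts) && common[n] == parts[n] {
			n++
		}
		common = common[:n]
	}
	return strings.Join(common, "/")
}

// uploadedNames strips the shared prefix, mimicking how an uploader
// that removes leading common paths would name the stored files.
func uploadedNames(paths []string) []string {
	prefix := commonDir(paths)
	names := make([]string, len(paths))
	for i, p := range paths {
		names[i] = strings.TrimPrefix(p, prefix+"/")
	}
	return names
}

func main() {
	// Hypothetical file names, for illustration only.
	before := []string{"build/logs/build.log", "dist/windows-amd64/ollama.exe"}
	after := []string{"dist/windows-amd64/ollama.exe", "dist/windows-amd64/lib/ggml.dll"}
	fmt.Println(uploadedNames(before)) // no shared prefix, paths kept whole
	fmt.Println(uploadedNames(after))  // dist/windows-amd64 is stripped
}
```

While a `build/` file is part of the upload, the paths share no leading directory and are stored whole; once only `dist/windows-amd64` files remain, that prefix is common to all of them and disappears from the stored names.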

581a4a55

win: builtin arm runner (#8039) · cf4d7c52

Daniel Hiltgen authored Dec 11, 2024

The new build embeds the arm runner in the
main binary, so there is no longer a lib/ollama

cf4d7c52

ci: build dir changed (#8037) · 6a6328a5
Daniel Hiltgen authored Dec 10, 2024
```
Remove no longer relevant build log dir
```
6a6328a5
llama: update vendored code to commit 40c6d79f (#7875) · 527cc978
Jeffrey Morgan authored Dec 10, 2024

527cc978
go.mod: go 1.22.8 -> 1.23.4 (#8036) · a37f4a86
Blake Mizerany authored Dec 10, 2024

a37f4a86
Return err when NewHipLib() detect error. (#8012) · 46f74e0c
湛露先生 authored Dec 11, 2024
```
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
```
46f74e0c
readme: add AI summary helper plugin to community-integrations (#7202) · 7622ea21
Phil Wornath authored Dec 11, 2024

7622ea21

10 Dec, 2024 7 commits

readme: add Kangaroo, an AI-powered SQL admin tool to community integrations (#7948) · c5d39470
Tao Zuhong authored Dec 11, 2024

c5d39470
server: lowercase hostname for Host header check (#5851) · 757eeacc
frob authored Dec 10, 2024

757eeacc
readme: add aidful-ollama-model-delete to community integrations (#8024) · dd42acf7
Dr. Daniel Bender authored Dec 10, 2024

dd42acf7

Remove unused runner CpuFeatures (#8032) · b9ccb374

Daniel Hiltgen authored Dec 10, 2024

The final implementation of #7499 removed dynamic vector requirements
in favor of a simpler filename-based model, and this was leftover logic that
is no longer needed.

b9ccb374

all: fix typos in documentation, code, and comments (#7021) · abfdc471
Stefan Weil authored Dec 10, 2024

abfdc471
build: fix typo in override variable (#8031) · 82a02e18
Daniel Hiltgen authored Dec 10, 2024
```
The "F" was missing.
```
82a02e18

build: Make target improvements (#7499) · 4879a234

Daniel Hiltgen authored Dec 10, 2024

* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
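
The filename-based compatibility check described under "Support customized CPU flags for runners" might be sketched like this. It is an assumption-laden illustration, not the repository's code: the `runnable` helper, the underscore-separated naming, and the runner names used below are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// runnable reports whether a runner may be started on this CPU, judging
// only by vector flags embedded in the runner's name. A name with no
// recognised flag (for example a custom build) is never rejected.
// Hypothetical helper: the naming scheme here is assumed.
func runnable(runner string, cpu map[string]bool) bool {
	for _, part := range strings.Split(runner, "_") {
		switch part {
		case "avx", "avx2":
			if !cpu[part] {
				return false
			}
		}
	}
	return true
}

func main() {
	cpu := map[string]bool{"avx": true} // a CPU with AVX but not AVX2
	for _, r := range []string{"cpu_avx", "cpu_avx2", "cuda_custom"} {
		fmt.Println(r, runnable(r, cpu))
	}
}
```

Because the decision is made from the name alone, a customized build that omits the naming scheme passes through unchecked, matching the commit's note that compatibility is not verified in that case.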

4879a234