- 23 Aug, 2024 3 commits
-
-
Patrick Devine authored
-
Daniel Hiltgen authored
During rebasing, the ordering was inverted, which broke the cuda version selection logic: the driver version was incorrectly evaluated as zero, causing a downgrade to v11.
-
Daniel Hiltgen authored
The define changed recently, and a use of the old name slipped through the cracks.
-
- 22 Aug, 2024 1 commit
-
-
Daniel Hiltgen authored
* Fix embeddings memory corruption
The patch was leading to a buffer overrun corruption. Once it was removed, though, parallelism in server.cpp led to hitting an assert due to slot/seq IDs being >= token count. To work around this, only use slot 0 for embeddings.
* Fix embed integration test assumption
The token eval count has changed with recent llama.cpp bumps (0.3.5+).
-
- 21 Aug, 2024 8 commits
-
-
Michael Yang authored
convert: update llama conversion for llama3.1
-
Michael Yang authored
-
Michael Yang authored
convert gemma2
-
Michael Yang authored
convert bert model from safetensors
-
Michael Yang authored
fix: chmod new layer to 0o644 when creating it
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 20 Aug, 2024 1 commit
-
-
Daniel Hiltgen authored
We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can be extracted into the same location as the main bundle.
-
- 19 Aug, 2024 17 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Fix overlapping artifact name on CI
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Cuda v12
-
Daniel Hiltgen authored
Override numParallel in pickBestPartialFitByLibrary() only if unset.
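The "only if unset" rule can be sketched as follows. This is an illustration, not the code from the commit: the helper name `resolveNumParallel` is hypothetical, and it assumes that a non-positive value means the setting was left unset.

```go
package main

// resolveNumParallel is a hypothetical helper. It applies the fallback value
// only when the caller has not chosen a value. Assumption: a value <= 0
// means "unset".
func resolveNumParallel(numParallel *int, fallback int) {
	if *numParallel <= 0 {
		*numParallel = fallback
	}
	// An explicit positive value is left untouched.
}
```

Under that assumption, an explicit setting such as 4 survives the call, while 0 is replaced by the fallback.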
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Based on compute capability and driver version, pick v12 or v11 cuda variants.
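A minimal sketch of such a selection, in Go. The struct, its field names, the function name `pickCudaVariant`, and both thresholds are assumptions made for illustration; the commit message does not state the actual cut-offs.

```go
package main

// gpuInfo holds the two inputs the commit message names. The type and field
// names are hypothetical.
type gpuInfo struct {
	ComputeMajor int // major compute capability of the GPU
	DriverMajor  int // major version of the installed driver
}

// Illustrative thresholds (assumptions, not taken from the commit).
const (
	minComputeMajorForV12 = 6
	minDriverMajorForV12  = 12
)

// pickCudaVariant returns "v12" only when both the GPU and the driver are
// new enough; otherwise it falls back to "v11".
func pickCudaVariant(g gpuInfo) string {
	if g.ComputeMajor < minComputeMajorForV12 || g.DriverMajor < minDriverMajorForV12 {
		return "v11"
	}
	return "v12"
}
```

With this shape, a driver version that is read as zero always selects v11, so the driver version must be populated before the check runs.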
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This adds new variants for arm64 specific to Jetson platforms
-
Daniel Hiltgen authored
This should help speed things up a little
-
Daniel Hiltgen authored
This adjusts linux to follow a similar model to windows with a discrete archive (zip/tgz) to carry the primary executable and dependent libraries. Runners are still carried as payloads inside the main binary. Darwin retains the payload model where the go binary is fully self contained.
-
Jeffrey Morgan authored
-
- 18 Aug, 2024 2 commits
-
-
Richard Lyons authored
-
Richard Lyons authored
-
- 17 Aug, 2024 1 commit
-
-
Richard Lyons authored
-
- 16 Aug, 2024 1 commit
-
-
zwwhdls authored
Signed-off-by: zwwhdls <zww@hdls.me>
-
- 15 Aug, 2024 4 commits
-
-
Daniel Hiltgen authored
fix: Add tooltip to system tray icon
-
eust-w authored
- Updated setIcon method to include tooltip text for the system tray icon.
- Added NIF_TIP flag and set the tooltip text using UTF16 encoding.
Resolves: #6372
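The tooltip handling can be sketched in portable Go as below. This is an illustration under stated assumptions, not the code from the commit: `notifyIconData` is a trimmed, hypothetical stand-in for the Win32 `NOTIFYICONDATAW` structure, and `setTip` is a hypothetical helper. The `NIF_TIP` value (0x4) and the 128-element tip buffer follow the Win32 headers.

```go
package main

import "unicode/utf16"

// nifTip mirrors the Win32 NIF_TIP flag, which marks the tip field as valid.
const nifTip = 0x00000004

// notifyIconData is a trimmed, hypothetical stand-in for NOTIFYICONDATAW.
type notifyIconData struct {
	Flags uint32
	Tip   [128]uint16 // NUL-terminated UTF-16 tooltip text
}

// setTip encodes text as UTF-16, copies it into the fixed-size tip buffer,
// and sets the NIF_TIP flag.
func setTip(nid *notifyIconData, text string) {
	encoded := utf16.Encode([]rune(text))
	// Leave room for the terminating NUL. A simple cut like this could split
	// a surrogate pair in very long text; the sketch does not handle that.
	if len(encoded) > len(nid.Tip)-1 {
		encoded = encoded[:len(nid.Tip)-1]
	}
	nid.Tip = [128]uint16{} // clear any previous text
	copy(nid.Tip[:], encoded)
	nid.Flags |= nifTip
}
```

In a real tray implementation the populated structure would then be passed to the shell notification call; that step is omitted here because it is Windows-specific.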
-
Michael Yang authored
fix: noprune on pull
-
Michael Yang authored
-
- 14 Aug, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-