Commits · fda0d3be5224b59a4b4b031e18c89adca71657ed · OpenDAS / ollama

12 Sep, 2024 2 commits

Use GOARCH for build dirs (#6779) · fda0d3be
Daniel Hiltgen authored Sep 12, 2024
```
Corrects x86_64 vs amd64 discrepancy
```
fda0d3be
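
The x86_64 vs amd64 discrepancy comes from tools such as `uname -m` reporting a different spelling than Go's `GOARCH`. A minimal sketch of normalizing to the `GOARCH` spelling before deriving a build directory follows; the mapping table, the `normalizeArch` and `buildDir` helpers, and the `build/<os>/<arch>` layout are illustrative assumptions, not taken from the commit itself.

```go
package main

import "fmt"

// normalizeArch maps machine names as reported by tools like `uname -m`
// onto Go's GOARCH spellings. The table is illustrative only.
func normalizeArch(machine string) string {
	switch machine {
	case "x86_64":
		return "amd64"
	case "aarch64":
		return "arm64"
	default:
		// Already a GOARCH-style name (or unknown): pass through unchanged.
		return machine
	}
}

// buildDir returns a per-platform build directory. The layout
// "build/<os>/<arch>" is a hypothetical example.
func buildDir(goos, machine string) string {
	return fmt.Sprintf("build/%s/%s", goos, normalizeArch(machine))
}

func main() {
	fmt.Println(buildDir("linux", "x86_64"))
	fmt.Println(buildDir("linux", "arm64"))
}
```

Both calls resolve to GOARCH-style directory names, so scripts that read `uname -m` and code that reads `runtime.GOARCH` would agree on one path.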

Optimize container images for startup (#6547) · cd5c8f64

Daniel Hiltgen authored Sep 12, 2024

* Optimize container images for startup

This change adjusts how runner payloads are handled to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release

cd5c8f64
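
The idea of keeping runners extracted in the filesystem can be sketched as "use a pre-extracted directory if one exists, otherwise fall back to extraction". This is only an illustration of that pattern: `findRunners`, its candidate list, and the `extract` callback are hypothetical names and do not reflect the actual payload code in this commit.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findRunners returns the first candidate directory that already exists
// on disk. If none exists, it falls back to extract, which stands in for
// unpacking embedded payloads at startup (hypothetical callback).
func findRunners(candidates []string, extract func() (string, error)) (string, error) {
	for _, dir := range candidates {
		info, err := os.Stat(dir)
		if err == nil && info.IsDir() {
			return dir, nil // pre-extracted: no startup extraction needed
		}
	}
	return extract()
}

func main() {
	tmp, err := os.MkdirTemp("", "runners-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(tmp)

	// Simulate a container image that ships runners already extracted.
	preExtracted := filepath.Join(tmp, "lib", "runners")
	if err := os.MkdirAll(preExtracted, 0o755); err != nil {
		panic(err)
	}

	dir, err := findRunners([]string{preExtracted}, func() (string, error) {
		return "", fmt.Errorf("extraction should not be needed here")
	})
	fmt.Println(dir == preExtracted, err)
}
```

Skipping extraction when the files are already present is what would let a container start faster, at the cost of the image carrying the runners as ordinary files.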

04 Sep, 2024 1 commit
- llm: update llama.cpp commit to 8962422 (#6618) · 5e2653f9
  Jeffrey Morgan authored Sep 03, 2024
  
  5e2653f9
29 Aug, 2024 1 commit
- remove any unneeded build artifacts · 11018196
  Michael Yang authored Aug 29, 2024
  
  11018196
23 Aug, 2024 1 commit
- llm: Align cmake define for cuda no peer copy (#6455) · 0b03b9c3
  Daniel Hiltgen authored Aug 23, 2024
```
The define was renamed recently, and this usage slipped through the cracks
with the old name.
```
  0b03b9c3
20 Aug, 2024 1 commit

Split rocm back out of bundle (#6432) · a017cf2f

Daniel Hiltgen authored Aug 20, 2024

We're over budget for GitHub's maximum release artifact size with rocm + 2 cuda
versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can
be extracted into the same location as the main bundle.

a017cf2f

19 Aug, 2024 6 commits
- Review comments · f9e31da9
  Daniel Hiltgen authored Aug 15, 2024
  
  f9e31da9
- Adjust layout to bin+lib/ollama · 88bb9e33
  Daniel Hiltgen authored Aug 14, 2024
  
  88bb9e33
- Add windows cuda v12 + v11 support · 927d98a6
  Daniel Hiltgen authored Jul 12, 2024
  
  927d98a6
- Add Jetson cuda variants for arm · d470ebe7
  Daniel Hiltgen authored May 30, 2024
```
This adds new variants for arm64 specific to Jetson platforms
```
  d470ebe7
- Wire up ccache and pigz in the docker based build · c7bcb003
  Daniel Hiltgen authored Aug 09, 2024
```
This should help speed things up a little
```
  c7bcb003
- Refactor linux packaging · 74d45f01
  Daniel Hiltgen authored Jul 08, 2024
```
This adjusts linux to follow a similar model to windows with a discrete archive
(zip/tgz) to carry the primary executable and dependent libraries. Runners are
still carried as payloads inside the main binary.

Darwin retains the payload model where the go binary is fully self-contained.
```
  74d45f01
20 Jul, 2024 1 commit

Adjust windows ROCm discovery · 283948c8

Daniel Hiltgen authored Jul 19, 2024

The v5 hip library returns unsupported GPUs which won't enumerate at
inference time in the runner, so this makes sure we align discovery. The
gfx906 cards are no longer supported, so we shouldn't compile with that
GPU type as it won't enumerate at runtime.

283948c8

11 Jul, 2024 1 commit
- llm: dont link cuda with compat libs (#5621) · efbf41ed
  Jeffrey Morgan authored Jul 10, 2024
  
  efbf41ed
10 Jul, 2024 2 commits
- remove `GGML_CUDA_FORCE_MMQ=on` from build (#5588) · 4e262eb2
  Jeffrey Morgan authored Jul 10, 2024
  
  4e262eb2
- Bump ROCm on windows to 6.1.2 · 1f50356e
  Daniel Hiltgen authored Jul 10, 2024
```
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
```
  1f50356e
08 Jul, 2024 1 commit
- Workaround broken ROCm p2p copy · 0bacb300
  Daniel Hiltgen authored Jul 05, 2024
```
Enable the build flag for llama.cpp to use CPU copy for multi-GPU scenarios.
```
  0bacb300
06 Jul, 2024 5 commits
- llm: add `-DBUILD_SHARED_LIBS=off` to common cpu cmake flags (#5520) · 4607c706
  Jeffrey Morgan authored Jul 06, 2024
  
  4607c706
- llm: statically link pthread and stdc++ dependencies in windows build · f1a379aa
  jmorganca authored Jul 06, 2024
  
  f1a379aa
- llm: add `GGML_STATIC` flag to windows static lib · 9ae14699
  jmorganca authored Jul 06, 2024
  
  9ae14699
- llm: add `COMMON_DARWIN_DEFS` to arm static build (#5513) · e0348d3f
  Jeffrey Morgan authored Jul 05, 2024
  
  e0348d3f
- llm: fix missing dylibs by restoring old build behavior on Linux and macOS (#5511) · 2cc854f8
  Jeffrey Morgan authored Jul 05, 2024
```
* Revert "fix cmake build (#5505)"

This reverts commit 4fd5f352.

* llm: fix missing dylibs by restoring old build behavior

* crlf -> lf
```
  2cc854f8
05 Jul, 2024 2 commits
- fix cmake build (#5505) · 4fd5f352
  Jeffrey Morgan authored Jul 05, 2024
  
  4fd5f352
- update llama.cpp submodule to `d7fd29f` (#5475) · 8f8e736b
  Jeffrey Morgan authored Jul 05, 2024
  
  8f8e736b
17 Jun, 2024 4 commits

Add back lower level parallel flags · b0930626

Daniel Hiltgen authored Jun 17, 2024

nvcc supports parallelism (threads) and cmake + make can use -j,
while msbuild requires /p:CL_MPcount=8

b0930626
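
The per-tool differences named above can be sketched as a small lookup. This is an illustration only: `parallelFlags` is a hypothetical helper, and the exact way the generate scripts pass these flags is not shown in the commit message. The `/p:CL_MPcount` spelling is copied from the message; `--threads` and `-j` are the standard nvcc and make options.

```go
package main

import (
	"fmt"
	"strconv"
)

// parallelFlags returns the arguments that ask the given build tool to
// run n jobs in parallel. The tool names are keys chosen for this sketch.
func parallelFlags(tool string, n int) ([]string, error) {
	jobs := strconv.Itoa(n)
	switch tool {
	case "nvcc":
		return []string{"--threads", jobs}, nil
	case "make":
		return []string{"-j" + jobs}, nil
	case "msbuild":
		// Property name as written in the commit message.
		return []string{"/p:CL_MPcount=" + jobs}, nil
	default:
		return nil, fmt.Errorf("no parallel flag known for %q", tool)
	}
}

func main() {
	for _, tool := range []string{"nvcc", "make", "msbuild"} {
		flags, err := parallelFlags(tool, 8)
		if err != nil {
			panic(err)
		}
		fmt.Println(tool, flags)
	}
}
```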

Revert "More parallelism on windows generate" · e890be48
Daniel Hiltgen authored Jun 17, 2024
```
This reverts commit 0577af98.
```
e890be48

Move libraries out of users path · b2799f11

Daniel Hiltgen authored Jun 15, 2024

We update the PATH on windows to get the CLI mapped, but this has
an unintended side effect of causing other apps that may use our bundled
DLLs to get terminated when we upgrade.

b2799f11

llm: update llama.cpp commit to `7c26775` (#4896) · 152fc202

Jeffrey Morgan authored Jun 17, 2024

* llm: update llama.cpp submodule to `7c26775`

* disable `LLAMA_BLAS` for now

* `-DLLAMA_OPENMP=off`

152fc202

15 Jun, 2024 1 commit
- More parallelism on windows generate · 0577af98
  Daniel Hiltgen authored Jun 13, 2024
```
Make the build faster
```
  0577af98
07 Jun, 2024 1 commit

Add ability to skip oneapi generate · ab8c929e

Daniel Hiltgen authored Jun 07, 2024

This follows the same pattern as cuda and rocm to allow
disabling the build even when we detect the dependent libraries.

ab8c929e
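
The skip pattern amounts to "build only if the libraries are detected and no skip variable is set". A minimal sketch, assuming a skip switch read from the environment: the variable name `EXAMPLE_SKIP_ONEAPI_GENERATE` and the `shouldBuild` helper are hypothetical, and the real generate scripts are shell and PowerShell rather than Go.

```go
package main

import (
	"fmt"
	"os"
)

// shouldBuild reports whether a backend should be built. A non-empty
// skip variable wins even when the dependent libraries were detected.
// getenv is injected so the logic can be exercised without touching the
// real environment.
func shouldBuild(detected bool, skipVar string, getenv func(string) string) bool {
	if getenv(skipVar) != "" {
		return false
	}
	return detected
}

func main() {
	// Hypothetical variable name, used only for this demonstration.
	const skipVar = "EXAMPLE_SKIP_ONEAPI_GENERATE"

	fmt.Println(shouldBuild(true, skipVar, os.Getenv))

	os.Setenv(skipVar, "1")
	fmt.Println(shouldBuild(true, skipVar, os.Getenv))
}
```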

31 May, 2024 1 commit
- speed up tests by only building static lib (#4740) · 7ca9605f
  Jeffrey Morgan authored May 30, 2024
  
  7ca9605f
24 May, 2024 1 commit
- support ollama run on Intel GPUs · fd5971be
  Wang,Zhe authored May 24, 2024
  
  fd5971be
15 May, 2024 1 commit
- Port cuda/rocm skip build vars to linux · c48c1d7c
  Daniel Hiltgen authored May 15, 2024
```
Windows already implements these, carry over to linux.
```
  c48c1d7c
27 Apr, 2024 2 commits
- Do not build AVX runners on ARM64 · 8a65717f
  Hernan Martinez authored Apr 26, 2024
  
  8a65717f
- Use architecture specific folders in the generate script · b438d485
  Hernan Martinez authored Apr 26, 2024
  
  b438d485
26 Apr, 2024 5 commits
- Fine grain control over windows generate steps · e4859c45
  Daniel Hiltgen authored Apr 26, 2024
```
This will speed up CI which already tries to only build static for unit tests
```
  e4859c45
- Fix target in gen_windows.ps1 · ed5fb088
  Daniel Hiltgen authored Apr 26, 2024
  
  ed5fb088
- Put back non-avx CPU build for windows · 421c878a
  Daniel Hiltgen authored Apr 26, 2024
  
  421c878a
- Refactor windows generate for more modular usage · 8671fded
  Daniel Hiltgen authored Apr 25, 2024
  
  8671fded
- Move cuda/rocm dependency gathering into generate script · 8feb97dc
  Daniel Hiltgen authored Apr 25, 2024
```
This will make it simpler for CI to accumulate artifacts from prior steps
```
  8feb97dc