Commits · 88bb9e332877dfbba40030c19570fdbe00f41a21 · OpenDAS / ollama

19 Aug, 2024 5 commits
- Adjust layout to bin+lib/ollama · 88bb9e33
  Daniel Hiltgen authored Aug 14, 2024
  
  88bb9e33
- Add windows cuda v12 + v11 support · 927d98a6
  Daniel Hiltgen authored Jul 12, 2024
  
  927d98a6
- Add Jetson cuda variants for arm · d470ebe7
  Daniel Hiltgen authored May 30, 2024
```
This adds new variants for arm64 specific to Jetson platforms
```
  d470ebe7
- Wire up ccache and pigz in the docker based build · c7bcb003
  Daniel Hiltgen authored Aug 09, 2024
```
This should help speed things up a little
```
  c7bcb003
- Refactor linux packaging · 74d45f01
  Daniel Hiltgen authored Jul 08, 2024
```
This adjusts linux to follow a similar model to windows with a discrete archive
(zip/tgz) to carry the primary executable, and dependent libraries. Runners are
still carried as payloads inside the main binary

Darwin retains the payload model where the go binary is fully self-contained.
```
  74d45f01
20 Jul, 2024 1 commit

- Adjust windows ROCm discovery · 283948c8
  Daniel Hiltgen authored Jul 19, 2024

  The v5 hip library returns unsupported GPUs which won't enumerate at
  inference time in the runner so this makes sure we align discovery. The
  gfx906 cards are no longer supported so we shouldn't compile with that
  GPU type as it won't enumerate at runtime.

  283948c8

11 Jul, 2024 1 commit
- llm: dont link cuda with compat libs (#5621) · efbf41ed
  Jeffrey Morgan authored Jul 10, 2024
  
  efbf41ed
10 Jul, 2024 2 commits
- remove `GGML_CUDA_FORCE_MMQ=on` from build (#5588) · 4e262eb2
  Jeffrey Morgan authored Jul 10, 2024
  
  4e262eb2
- Bump ROCm on windows to 6.1.2 · 1f50356e
  Daniel Hiltgen authored Jul 10, 2024
```
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
```
  1f50356e
08 Jul, 2024 1 commit
- Workaround broken ROCm p2p copy · 0bacb300
  Daniel Hiltgen authored Jul 05, 2024
```
Enable the build flag for llama.cpp to use CPU copy for multi-GPU scenarios.
```
  0bacb300
06 Jul, 2024 5 commits
- llm: add `-DBUILD_SHARED_LIBS=off` to common cpu cmake flags (#5520) · 4607c706
  Jeffrey Morgan authored Jul 06, 2024
  
  4607c706
- llm: statically link pthread and stdc++ dependencies in windows build · f1a379aa
  jmorganca authored Jul 06, 2024
  
  f1a379aa
- llm: add `GGML_STATIC` flag to windows static lib · 9ae14699
  jmorganca authored Jul 06, 2024
  
  9ae14699
- llm: add `COMMON_DARWIN_DEFS` to arm static build (#5513) · e0348d3f
  Jeffrey Morgan authored Jul 05, 2024
  
  e0348d3f
- llm: fix missing dylibs by restoring old build behavior on Linux and macOS (#5511) · 2cc854f8
  Jeffrey Morgan authored Jul 05, 2024
```
* Revert "fix cmake build (#5505)"

This reverts commit 4fd5f352.

* llm: fix missing dylibs by restoring old build behavior

* crlf -> lf
```
  2cc854f8
05 Jul, 2024 2 commits
- fix cmake build (#5505) · 4fd5f352
  Jeffrey Morgan authored Jul 05, 2024
  
  4fd5f352
- update llama.cpp submodule to `d7fd29f` (#5475) · 8f8e736b
  Jeffrey Morgan authored Jul 05, 2024
  
  8f8e736b
17 Jun, 2024 4 commits

- Add back lower level parallel flags · b0930626
  Daniel Hiltgen authored Jun 17, 2024

  nvcc supports parallelism (threads) and cmake + make can use -j,
  while msbuild requires /p:CL_MPcount=8

  b0930626

- Revert "More parallelism on windows generate" · e890be48
  Daniel Hiltgen authored Jun 17, 2024
```
This reverts commit 0577af98.
```
  e890be48

- Move libraries out of users path · b2799f11
  Daniel Hiltgen authored Jun 15, 2024

  We update the PATH on windows to get the CLI mapped, but this has
  an unintended side effect of causing other apps that may use our bundled
  DLLs to get terminated when we upgrade.

  b2799f11

- llm: update llama.cpp commit to `7c26775` (#4896) · 152fc202
  Jeffrey Morgan authored Jun 17, 2024

  * llm: update llama.cpp submodule to `7c26775`

  * disable `LLAMA_BLAS` for now

  * `-DLLAMA_OPENMP=off`

  152fc202

15 Jun, 2024 1 commit
- More parallelism on windows generate · 0577af98
  Daniel Hiltgen authored Jun 13, 2024
```
Make the build faster
```
  0577af98
07 Jun, 2024 1 commit

- Add ability to skip oneapi generate · ab8c929e
  Daniel Hiltgen authored Jun 07, 2024

  This follows the same pattern as cuda and rocm to allow
  disabling the build even when we detect the dependent libraries

  ab8c929e

31 May, 2024 1 commit
- speed up tests by only building static lib (#4740) · 7ca9605f
  Jeffrey Morgan authored May 30, 2024
  
  7ca9605f
24 May, 2024 1 commit
- support ollama run on Intel GPUs · fd5971be
  Wang,Zhe authored May 24, 2024
  
  fd5971be
15 May, 2024 1 commit
- Port cuda/rocm skip build vars to linux · c48c1d7c
  Daniel Hiltgen authored May 15, 2024
```
Windows already implements these, carry over to linux.
```
  c48c1d7c
27 Apr, 2024 2 commits
- Do not build AVX runners on ARM64 · 8a65717f
  Hernan Martinez authored Apr 26, 2024
  
  8a65717f
- Use architecture specific folders in the generate script · b438d485
  Hernan Martinez authored Apr 26, 2024
  
  b438d485
26 Apr, 2024 5 commits
- Fine grain control over windows generate steps · e4859c45
  Daniel Hiltgen authored Apr 26, 2024
```
This will speed up CI which already tries to only build static for unit tests
```
  e4859c45
- Fix target in gen_windows.ps1 · ed5fb088
  Daniel Hiltgen authored Apr 26, 2024
  
  ed5fb088
- Put back non-avx CPU build for windows · 421c878a
  Daniel Hiltgen authored Apr 26, 2024
  
  421c878a
- Refactor windows generate for more modular usage · 8671fded
  Daniel Hiltgen authored Apr 25, 2024
  
  8671fded
- Move cuda/rocm dependency gathering into generate script · 8feb97dc
  Daniel Hiltgen authored Apr 25, 2024
```
This will make it simpler for CI to accumulate artifacts from prior steps
```
  8feb97dc
25 Apr, 2024 1 commit
- Remove trailing spaces (#3889) · 5f73c087
  Roy Yang authored Apr 25, 2024
  
  5f73c087
23 Apr, 2024 1 commit

- Move nested payloads to installer and zip file on windows · 058f6cd2
  Daniel Hiltgen authored Apr 23, 2024

  Now that the llm runner is an executable and not just a dll, more users are facing
  problems with security policy configurations on windows that prevent users
  from writing to directories and then executing binaries from the same location.
  This change removes payloads from the main executable on windows and shifts them
  over to be packaged in the installer and discovered based on the executable's location.
  This also adds a new zip file for people who want to "roll their own" installation model.

  058f6cd2
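
The location-based discovery described in this commit message can be sketched roughly as below. This is an illustration under stated assumptions, not the project's implementation: the function name `find_runner_dir` and the directory name `ollama_runners` are hypothetical and do not come from the commit.

```python
from pathlib import Path
from typing import Optional


def find_runner_dir(exe_path: str, dirname: str = "ollama_runners") -> Optional[Path]:
    """Return the payload directory that sits beside the executable, if any.

    `dirname` is a hypothetical directory name chosen for illustration only.
    """
    # Resolve symlinks so we look next to the real binary, not a shortcut to it.
    exe_dir = Path(exe_path).resolve().parent
    candidate = exe_dir / dirname
    # Only report the directory when it actually exists on disk.
    return candidate if candidate.is_dir() else None
```

Looking beside the executable avoids unpacking binaries into a user-writable directory at runtime, which is the pattern the security policies mentioned above tend to block.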

21 Apr, 2024 1 commit
- Update gen_windows.ps1 · 9c0db4cc
  Jeremy authored Apr 21, 2024
```
Fixed improper env references
```
  9c0db4cc
18 Apr, 2024 3 commits

- Update gen_windows.ps1 · 6f18297b
  Jeremy authored Apr 18, 2024
```
Forgot a " on the write-host
```
  6f18297b

- Update gen_windows.ps1 · 15016413
  Jeremy authored Apr 18, 2024

  Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows

  15016413

- Update gen_linux.sh · 440b7190
  Jeremy authored Apr 18, 2024

  Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS

  440b7190
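
One way a build script might consume per-backend variables such as `OLLAMA_CUSTOM_CUDA_DEFS` and `OLLAMA_CUSTOM_ROCM_DEFS` is sketched below. The variable names come from the commit messages above. The helper `cmake_defs`, the base flag, and the assumption that the variable holds space-separated CMake definitions appended after the defaults are all illustrative guesses, not the behavior of the actual generate scripts.

```python
import os
import shlex
from typing import List, Mapping


def cmake_defs(base: List[str], env_var: str,
               env: Mapping[str, str] = os.environ) -> List[str]:
    """Return `base` followed by any definitions found in `env_var`.

    Assumes (hypothetically) that the variable holds shell-style,
    space-separated flags such as "-DFOO=on -DBAR=off".
    """
    # shlex.split honors quoting, so a value containing spaces stays one flag.
    extra = shlex.split(env.get(env_var, ""))
    return base + extra


# Illustrative usage: the base flag here is a placeholder, not the real default.
cuda_flags = cmake_defs(["-DCMAKE_BUILD_TYPE=Release"], "OLLAMA_CUSTOM_CUDA_DEFS")
```

Keeping one variable per backend, rather than a single shared `OLLAMA_CUSTOM_GPU_DEFS`, would let CUDA and ROCm builds receive different flags in the same environment.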

17 Apr, 2024 1 commit
- add support for custom gpu build flags for llama.cpp · 52f5370c
  Jeremy authored Apr 17, 2024
  
  52f5370c