- 26 Apr, 2024 9 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Fix exe name for zip packaging on windows
-
Daniel Hiltgen authored
The zip file encodes the OS and architecture, so keep the short exe name
-
Daniel Hiltgen authored
Refactor windows generate for more modular usage
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Move cuda/rocm dependency gathering into generate script
-
Daniel Hiltgen authored
This will make it simpler for CI to accumulate artifacts from prior steps
-
Daniel Hiltgen authored
Fix release CI
-
Daniel Hiltgen authored
The download-artifact path was being used incorrectly. It specifies where to extract the zip, not which files in the zip to extract. The default is the workspace dir, which is what we want, so omit it.
-
- 25 Apr, 2024 9 commits
-
-
Michael Yang authored
only count output tensors
-
Daniel Hiltgen authored
Improve mac parallel performance
-
Jeffrey Morgan authored
* reload model if `num_gpu` changes * don't reload on -1 * fix tests
-
Jeffrey Morgan authored
* llm: limit generation to 10x context size to avoid run-on generations * add comment * simplify condition statement
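A minimal sketch of the 10x cap described in this commit, not the actual change. The helper name `clampNumPredict`, its parameters, and the treatment of a negative value as "no explicit limit" are assumptions made for illustration.

```go
package main

import "fmt"

// clampNumPredict is a hypothetical helper. It caps the number of tokens to
// generate at 10x the context size. A negative numPredict is assumed to mean
// "no explicit limit", so it is also replaced by the cap.
func clampNumPredict(numPredict, numCtx int) int {
	limit := 10 * numCtx
	if numPredict < 0 || numPredict > limit {
		return limit
	}
	return numPredict
}

func main() {
	fmt.Println(clampNumPredict(-1, 2048))    // 20480
	fmt.Println(clampNumPredict(128, 2048))   // 128
	fmt.Println(clampNumPredict(50000, 2048)) // 20480
}
```

A single combined condition keeps the cap in one place, so requests with and without an explicit limit go through the same check.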
-
Michael Yang authored
-
Daniel Hiltgen authored
-
jmorganca authored
-
Roy Yang authored
-
Daniel Hiltgen authored
Move ggml loading to when attempting to fit
-
- 24 Apr, 2024 18 commits
-
-
Bryce Reitano authored
-
Bryce Reitano authored
-
Bryce Reitano authored
-
Michael Yang authored
update copy handler to use model.Name
-
Michael Yang authored
-
Michael Yang authored
fix: from blob
-
Michael Yang authored
-
Michael Yang authored
-
Blake Mizerany authored
-
Daniel Hiltgen authored
AMD gfx patch rev is hex
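To illustrate why the patch revision has to be treated as hex: a target such as gfx90a has patch revision 10, which only renders as "a" in hexadecimal. The sketch below assumes the version is already split into major, minor, and patch integers. The helper name `gfxName` is hypothetical and is not taken from the commit.

```go
package main

import "fmt"

// gfxName is a hypothetical helper that builds an AMD gfx target name from
// its numeric parts. The patch revision is formatted as hex, so patch 10
// becomes "a" rather than "10".
func gfxName(major, minor, patch uint) string {
	return fmt.Sprintf("gfx%d%d%x", major, minor, patch)
}

func main() {
	fmt.Println(gfxName(9, 0, 10)) // gfx90a
	fmt.Println(gfxName(9, 0, 6))  // gfx906
	fmt.Println(gfxName(10, 3, 0)) // gfx1030
}
```

Formatting the patch with `%d` instead would yield "gfx9010" for the first case, which does not match any real target name.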
-
Daniel Hiltgen authored
Report errors on server lookup instead of path lookup failure
-
Daniel Hiltgen authored
Correctly handle gfx90a discovery
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Blake Mizerany authored
This allows users of a valid Digest to know it has a minimum of 2 characters in the hash part for use when sharding. This is a reasonable restriction, as the hash part is typically a SHA256 hash, which is 64 characters long. There is no anticipation of using a hash with fewer than 2 characters. Also, add MustParseDigest. Also, replace Digest.Type with Digest.Split for getting both the type and hash parts together, which is the most common case when asking for either.
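The guarantee above can be sketched as follows. This is an illustration only, not the package's actual code: the `-` separator, the struct fields, the hex check, and the `shardPath` helper are assumptions. Only the names Digest, MustParseDigest, and Split come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Digest is a simplified stand-in for the type described in the commit.
type Digest struct {
	typ  string
	hash string
}

var errInvalidDigest = errors.New("invalid digest")

// isHex reports whether s consists only of lowercase hex digits.
func isHex(s string) bool {
	for _, c := range s {
		if !strings.ContainsRune("0123456789abcdef", c) {
			return false
		}
	}
	return true
}

// ParseDigest splits s at the first '-' (an assumed separator) and rejects
// any digest whose hash part has fewer than 2 characters.
func ParseDigest(s string) (Digest, error) {
	typ, hash, ok := strings.Cut(s, "-")
	if !ok || typ == "" || len(hash) < 2 || !isHex(hash) {
		return Digest{}, errInvalidDigest
	}
	return Digest{typ: typ, hash: hash}, nil
}

// MustParseDigest is like ParseDigest but panics on invalid input.
func MustParseDigest(s string) Digest {
	d, err := ParseDigest(s)
	if err != nil {
		panic(err)
	}
	return d
}

// Split returns the type and hash parts together.
func (d Digest) Split() (typ, hash string) {
	return d.typ, d.hash
}

// shardPath is a hypothetical consumer. Slicing hash[:2] is safe because
// every valid Digest has at least 2 hash characters.
func shardPath(d Digest) string {
	typ, hash := d.Split()
	return typ + "/" + hash[:2] + "/" + hash
}

func main() {
	d := MustParseDigest("sha256-ab12cd")
	fmt.Println(shardPath(d)) // sha256/ab/ab12cd
}
```

Enforcing the minimum length at parse time means callers that shard by prefix do not need their own length check.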
-
Daniel Hiltgen authored
Add back memory escape valve
-
Daniel Hiltgen authored
If we get our predictions wrong, this can be used to set a lower memory limit as a workaround. Recent multi-gpu refactoring accidentally removed it, so this adds it back.
-
- 23 Apr, 2024 4 commits
-
-
Daniel Hiltgen authored
Move nested payloads to installer and zip file on windows
-
Daniel Hiltgen authored
Give the goroutine a moment to deliver the expired event
-
Daniel Hiltgen authored
Now that the llm runner is an executable and not just a dll, more users are facing problems with security policy configurations on windows that prevent users from writing to directories and then executing binaries from the same location. This change removes payloads from the main executable on windows and shifts them over to be packaged in the installer and discovered based on the executable's location. This also adds a new zip file for people who want to "roll their own" installation model.
-
Daniel Hiltgen authored
Detect and recover if runner removed
-