"vscode:/vscode.git/clone" did not exist on "5f633fcbc223fa901bf940f941cbbad09fffacd7"
- 29 Apr, 2025 1 commit
  Devon Rifkin authored
- 28 Apr, 2025 1 commit
  Devon Rifkin authored
  This reverts commit 424f6486.
- 22 Apr, 2025 1 commit
  Devon Rifkin authored
  * increase default context length to 4096
    We lower the default numParallel from 4 to 2 and use these "savings" to double the default context length from 2048 to 4096. We're memory neutral in cases when we previously would've used numParallel == 4, but we add the following mitigation to handle some cases where we would have previously fallen back to 1x2048 due to low VRAM: we decide between 2048 and 4096 using a runtime check, choosing 2048 if we're on a one-GPU system with total VRAM of <= 4 GB. We purposefully don't check the available VRAM because we don't want the context window size to change unexpectedly based on the available VRAM. We plan on making the default even larger, but this is a relatively low-risk change we can make to quickly double it.
  * fix tests
    Add an explicit context length so the tests don't get truncated. The code that converts -1 from being a signal for doing a runtime check isn't running as part of these tests.
  * tweak small gpu message
  * clarify context length default
    Also make it actually show up in `ollama serve --help`.
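The runtime check described in that commit message can be sketched as follows. This is an illustrative reconstruction, not Ollama's actual code: the function name, parameters, and constants here are assumptions; only the decision rule (2048 on a one-GPU system with total VRAM <= 4 GB, else 4096, keyed on total rather than available VRAM) comes from the commit message.

```go
package main

import "fmt"

// defaultContextLength is a hypothetical sketch of the runtime check
// described in the commit message: fall back to a 2048-token context
// only on single-GPU systems whose *total* VRAM is at most 4 GB.
// Total (not available) VRAM is used so the default doesn't change
// unexpectedly between runs based on what other processes allocate.
func defaultContextLength(gpuCount int, totalVRAMBytes uint64) int {
	const fourGB = 4 * 1024 * 1024 * 1024
	if gpuCount == 1 && totalVRAMBytes <= fourGB {
		return 2048
	}
	return 4096
}

func main() {
	// One GPU with 2 GiB total VRAM: low-VRAM fallback.
	fmt.Println(defaultContextLength(1, 2*1024*1024*1024)) // 2048
	// Two GPUs with 8 GiB total: the new, larger default.
	fmt.Println(defaultContextLength(2, 8*1024*1024*1024)) // 4096
}
```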
- 27 Mar, 2025 1 commit
  Parth Sareen authored
- 13 Mar, 2025 1 commit
  Bradley Erickson authored
- 10 Mar, 2025 1 commit
  frob authored
- 07 Feb, 2025 2 commits
  Azis Alvriyanto authored
  Leisure Linux authored
- 03 Dec, 2024 1 commit
  Sam authored
- 25 Sep, 2024 1 commit
  Jeffrey Morgan authored
- 18 Sep, 2024 1 commit
  Patrick Devine authored
- 10 Sep, 2024 1 commit
  Jeffrey Morgan authored
- 02 Sep, 2024 1 commit
  SnoopyTlion authored
- 23 Aug, 2024 1 commit
  Michael Yang authored
- 29 Jul, 2024 1 commit
  Jeffrey Morgan authored
- 23 Jul, 2024 1 commit
  Daniel Hiltgen authored
- 10 Jul, 2024 1 commit
  Daniel Hiltgen authored
  This also adjusts our algorithm to favor our bundled ROCm. I've confirmed VRAM reporting still doesn't work properly, so we can't yet enable concurrency by default.
- 02 Jul, 2024 1 commit
  Daniel Hiltgen authored
- 28 Jun, 2024 1 commit
  Daniel Hiltgen authored
- 21 May, 2024 1 commit
  Patrick Devine authored
- 14 May, 2024 1 commit
  Patrick Devine authored
- 06 May, 2024 1 commit
  Jeffrey Chen authored
- 05 May, 2024 1 commit
  Daniel Hiltgen authored
  This also bumps up the default to be 50 queued requests instead of 10.
- 03 May, 2024 1 commit
  Dr Nic Williams authored
  * Update 'llama2' -> 'llama3' in most places
  Co-authored-by: Patrick Devine <patrick@infrahq.com>
- 24 Apr, 2024 1 commit
  Patrick Devine authored
- 26 Mar, 2024 1 commit
  Patrick Devine authored
- 21 Mar, 2024 2 commits
  Daniel Hiltgen authored
  Bruce MacDonald authored
- 20 Mar, 2024 1 commit
  Jeffrey Morgan authored
- 18 Mar, 2024 2 commits
  Jeffrey Morgan authored
  jmorganca authored
- 12 Mar, 2024 1 commit
  Daniel Hiltgen authored
- 21 Feb, 2024 1 commit
  Jeffrey Morgan authored
- 20 Feb, 2024 2 commits
  Jeffrey Morgan authored
  Jeffrey Morgan authored
- 19 Feb, 2024 2 commits
  Patrick Devine authored
  Daniel Hiltgen authored
- 16 Feb, 2024 1 commit
  Tristan Rhodes authored
- 22 Jan, 2024 1 commit
  Michael Yang authored
- 22 Dec, 2023 1 commit
  Matt Williams authored
  Signed-off-by: Matt Williams <m@technovangelist.com>