  1. 17 Nov, 2024 2 commits
  2. 16 Nov, 2024 1 commit
  3. 15 Nov, 2024 3 commits
    • runner.go: Propagate panics back to the user. · d875e99e
      Jesse Gross authored
      This is a partial revert of 8a35bb92
      "runner.go: Increase survivability of main processing loop", removing
      the panic handler.
      
      Although we want to avoid errors taking down the runner, we also
      should make the user aware of problems when they happen. In the
      future, we can restructure things so both parts are true.
    • runner.go: Increase survivability of main processing loop · 8a35bb92
      Jesse Gross authored
      Currently, if an error occurs during the prep stages (such as
      tokenizing) of a single request, it will only affect that request.
      However, if an error happens during decoding, it can take down the
      entire runner.
      
      Instead, it's better to drop the tokens that triggered the error and try to
      keep going. However, we also need to stop when we run out of tokens;
      otherwise, this just causes an infinite loop. This is likely the cause
      of at least some of the hanging issues that have been reported.
      
      Bug #7573
    • build: fix arm container image (#7674) · a0ea067b
      Daniel Hiltgen authored
      Fix a rebase glitch from the old C++ runner build model
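
The idea described in 8a35bb92 above (drop the tokens that triggered a decode error, keep going, and stop once no tokens remain) can be sketched roughly as follows. This is not the code from runner.go: `processAll`, `decodeFunc`, and `errDecode` are hypothetical names invented for illustration, and batching by a fixed size is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errDecode is a hypothetical stand-in for whatever error a real decoder returns.
var errDecode = errors.New("decode failed")

// decodeFunc is a hypothetical stand-in for the runner's decode step.
type decodeFunc func(batch []int) error

// processAll feeds tokens to decode in batches of at most batchSize.
// When a batch fails, its tokens are dropped and processing continues
// with the remaining tokens. The loop always terminates because every
// iteration consumes at least one token, whether or not decode succeeds.
func processAll(tokens []int, batchSize int, decode decodeFunc) (processed, dropped int) {
	if batchSize < 1 {
		batchSize = 1
	}
	for len(tokens) > 0 {
		n := batchSize
		if n > len(tokens) {
			n = len(tokens)
		}
		batch := tokens[:n]
		tokens = tokens[n:] // consume the batch before decoding it

		if err := decode(batch); err != nil {
			dropped += n // drop the offending tokens and keep going
			continue
		}
		processed += n
	}
	return processed, dropped
}

func main() {
	// A fake decoder that fails on any batch starting with token 3.
	decode := func(batch []int) error {
		if batch[0] == 3 {
			return errDecode
		}
		return nil
	}
	processed, dropped := processAll([]int{1, 2, 3, 4, 5}, 2, decode)
	fmt.Println(processed, dropped)
}
```

Consuming the batch before calling decode is the design choice that rules out the infinite loop the commit message warns about: a failing batch can never be retried forever.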
  4. 14 Nov, 2024 7 commits
  5. 12 Nov, 2024 8 commits
  6. 11 Nov, 2024 4 commits
  7. 10 Nov, 2024 1 commit
  8. 08 Nov, 2024 3 commits
  9. 07 Nov, 2024 5 commits
  10. 06 Nov, 2024 3 commits
  11. 05 Nov, 2024 3 commits
    • Update README.md (#7516) · 9d71bcc3
      RAPID ARCHITECT authored
      Added Reddit Rate below Hexabot: Ollama-powered Reddit search and analysis, with Streamlit for the interface.
    • One corrupt manifest should not wedge model operations (#7515) · a4c70fe1
      Daniel Hiltgen authored
      One potential failure mode is an empty file which bubbles up as an EOF error,
      leading to all pulls and listing operations failing.  Instead, continue and
      warn about the corrupt manifest.  This also allows re-pulling the corrupt
      manifest to repair the system.
    • prompt: Use a single token when estimating mllama context size · 34a75102
      Jesse Gross authored
      Currently we assume that images take 768 tokens of context size for
      the purposes of clipping old messages that exceed the context window.
      However, our mllama implementation stores the full image embedding
      in a single token. As a result, there is significant waste of context
      space.
      
      Ideally, we would handle this more generically and have the
      implementation report the number of tokens. However, at the moment
      this would just result in a similar set of 'if' conditions in the
      runner plus APIs to report it back. So for now, we just keep this
      simple.
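
To make the effect of 34a75102 concrete, here is a minimal sketch of clipping old messages with a per-image context estimate. It is not the project's prompt code: `Message`, `imageTokens`, and `clip` are hypothetical names, and always keeping the newest message is an assumption. Only the two per-image figures (768 by default, 1 for mllama) come from the commit message.

```go
package main

import "fmt"

// Message is a simplified, hypothetical chat message.
type Message struct {
	TextTokens int // number of text tokens, assumed already counted
	Images     int // number of attached images
}

// imageTokens returns the context cost assumed per image: 768 by
// default, or 1 for mllama, which stores the full image embedding
// in a single token.
func imageTokens(isMllama bool) int {
	if isMllama {
		return 1
	}
	return 768
}

// clip drops the oldest messages until the estimated size fits in
// numCtx. The newest message is always kept (an assumption of this
// sketch), even if it alone exceeds numCtx.
func clip(msgs []Message, numCtx int, isMllama bool) []Message {
	perImage := imageTokens(isMllama)
	start := 0
	for start < len(msgs)-1 {
		total := 0
		for _, m := range msgs[start:] {
			total += m.TextTokens + m.Images*perImage
		}
		if total <= numCtx {
			break
		}
		start++ // drop the oldest remaining message and re-estimate
	}
	return msgs[start:]
}

func main() {
	msgs := []Message{
		{TextTokens: 100, Images: 1},
		{TextTokens: 50},
		{TextTokens: 20, Images: 1},
	}
	fmt.Println(len(clip(msgs, 1024, false))) // estimate uses 768 tokens per image
	fmt.Println(len(clip(msgs, 1024, true)))  // estimate uses 1 token per image
}
```

With a 1024-token window, the 768-token estimate forces the oldest message out, while the single-token estimate keeps the whole conversation. That difference is the wasted context space the commit message describes.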