1. 06 Sep, 2023 1 commit
  2. 05 Sep, 2023 2 commits
    • fix parameter inheritance · 06ef90c0
      Michael Yang authored
      parameters are not inherited because they are processed differently
      from other layers. fix this by explicitly merging the inherited params
      into the new params. parameter values defined in the new modelfile
      override those defined in the inherited modelfile. array values are
      replaced instead of appended.
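      A minimal Go sketch of the merge described above (hypothetical names,
      not the project's actual code): inherited values are copied first, new
      values override them, and slice values are replaced wholesale.

      ```go
      package main

      import "fmt"

      // mergeParams copies the inherited parameters, then overlays the
      // params from the new Modelfile. Because the overlay assigns whole
      // values, an inherited slice is replaced, not appended to.
      func mergeParams(inherited, updated map[string]any) map[string]any {
      	merged := make(map[string]any, len(inherited)+len(updated))
      	for k, v := range inherited {
      		merged[k] = v
      	}
      	for k, v := range updated {
      		merged[k] = v
      	}
      	return merged
      }

      func main() {
      	base := map[string]any{"temperature": 0.8, "stop": []string{"###"}}
      	child := map[string]any{"stop": []string{"<|end|>"}}
      	fmt.Println(mergeParams(base, child))
      	// map[stop:[<|end|>] temperature:0.8]
      }
      ```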
    • use slices.DeleteFunc · e9f6df7d
      Michael Yang authored
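      The commit has no body, but the stdlib behavior is straightforward; a
      small sketch of slices.DeleteFunc (Go 1.21+), with an assumed example
      input since the actual call site is not shown here:

      ```go
      package main

      import (
      	"fmt"
      	"slices"
      )

      func main() {
      	// slices.DeleteFunc removes every element for which the predicate
      	// returns true, replacing a hand-written filter loop.
      	names := []string{"model", "", "template", ""}
      	names = slices.DeleteFunc(names, func(s string) bool { return s == "" })
      	fmt.Println(names) // [model template]
      }
      ```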
  3. 03 Sep, 2023 1 commit
  4. 01 Sep, 2023 1 commit
    • do not HTML-escape prompt · 62d29b21
      Quinn Slack authored
      The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.
      
      The included test case passes, but before the code change, it failed:
      
      ```
      --- FAIL: TestModelPrompt
          images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
      ```
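      The fix is presumably a switch from html/template to text/template,
      which performs no escaping; a self-contained sketch of the difference:

      ```go
      package main

      import (
      	htmltemplate "html/template"
      	"os"
      	texttemplate "text/template"
      )

      func main() {
      	prompt := "a<h1>b"

      	// html/template escapes interpolated strings for safe HTML output.
      	htmltemplate.Must(htmltemplate.New("t").Parse("{{.}}\n")).Execute(os.Stdout, prompt)
      	// a&lt;h1&gt;b

      	// text/template leaves the prompt verbatim for the LLM.
      	texttemplate.Must(texttemplate.New("t").Parse("{{.}}\n")).Execute(os.Stdout, prompt)
      	// a<h1>b
      }
      ```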
  5. 31 Aug, 2023 4 commits
  6. 30 Aug, 2023 2 commits
    • subprocess llama.cpp server (#401) · 42998d79
      Bruce MacDonald authored
      * remove c code
      * pack llama.cpp
      * use request context for llama_cpp
      * let llama_cpp decide the number of threads to use
      * stop llama runner when app stops
      * remove sample count and duration metrics
      * use go generate to get libraries
      * tmp dir for running llm
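      A rough sketch of the subprocess approach, assuming a hypothetical
      binary path and flag (this is not the project's actual runner code):
      the server is tied to a context so it stops when the app or request
      does.

      ```go
      package main

      import (
      	"context"
      	"log"
      	"os"
      	"os/exec"
      )

      // runLlamaServer launches a llama.cpp server binary as a subprocess.
      // exec.CommandContext kills the process when ctx is cancelled, so the
      // runner stops when the app shuts down or the request context ends.
      func runLlamaServer(ctx context.Context, binPath, modelPath string) error {
      	cmd := exec.CommandContext(ctx, binPath, "--model", modelPath)
      	cmd.Stdout = os.Stdout
      	cmd.Stderr = os.Stderr
      	if err := cmd.Start(); err != nil {
      		return err
      	}
      	return cmd.Wait()
      }

      func main() {
      	ctx, cancel := context.WithCancel(context.Background())
      	defer cancel()
      	if err := runLlamaServer(ctx, "./llama-server", "./model.gguf"); err != nil {
      		log.Fatal(err)
      	}
      }
      ```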
    • treat stop as stop sequences, not exact tokens (#442) · f4432e1d
      Quinn Slack authored
      The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list.
      
      Fixes https://github.com/jmorganca/ollama/issues/295.
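      A minimal sketch of substring-based stop handling (hypothetical helper,
      not the project's exact code): the accumulated output, not individual
      tokens, is scanned for each stop sequence, and the match is trimmed.

      ```go
      package main

      import (
      	"fmt"
      	"strings"
      )

      // truncateAtStop returns the output cut at the first occurrence of any
      // stop sequence, and whether a stop sequence was found. This works even
      // when the tokenizer emits the sequence inside a larger token.
      func truncateAtStop(output string, stops []string) (string, bool) {
      	for _, stop := range stops {
      		if i := strings.Index(output, stop); i >= 0 {
      			return output[:i], true
      		}
      	}
      	return output, false
      }

      func main() {
      	out, stopped := truncateAtStop("hello\nworld", []string{"\n"})
      	fmt.Printf("%q %v\n", out, stopped) // "hello" true
      }
      ```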
  7. 29 Aug, 2023 1 commit
  8. 28 Aug, 2023 4 commits
  9. 26 Aug, 2023 1 commit
  10. 22 Aug, 2023 7 commits
  11. 18 Aug, 2023 2 commits
    • retry on unauthorized chunk push · 3b49315f
      Michael Yang authored
      The token issued for authorized requests has a lifetime of 1h. If an
      upload exceeds 1h, a chunk push will fail since the token is created
      on the initial "start upload" request.
      
      This replaces the Pipe with a SectionReader, which is simpler and
      implements Seek, a requirement for makeRequestWithRetry. This is
      slightly worse than using a Pipe since the progress update is tied
      directly to the chunk size instead of being controlled separately.
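      A sketch of why Seek matters here (hypothetical helper, simplified from
      the description above): a SectionReader can be rewound to the start of
      the chunk and resent after a fresh token is obtained.

      ```go
      package main

      import (
      	"bytes"
      	"fmt"
      	"io"
      )

      // pushChunkWithRetry rewinds the chunk before each attempt. Rewinding
      // requires Seek, which io.SectionReader implements and a Pipe does not.
      func pushChunkWithRetry(chunk *io.SectionReader, push func(io.Reader) error) error {
      	var err error
      	for attempt := 0; attempt < 2; attempt++ {
      		if _, err = chunk.Seek(0, io.SeekStart); err != nil {
      			return err
      		}
      		if err = push(chunk); err == nil {
      			return nil
      		}
      		// On a 401 the real flow would obtain a fresh token before retrying.
      	}
      	return err
      }

      func main() {
      	blob := bytes.NewReader([]byte("layer data"))
      	chunk := io.NewSectionReader(blob, 0, 5) // one 5-byte chunk of the blob
      	err := pushChunkWithRetry(chunk, func(r io.Reader) error {
      		b, _ := io.ReadAll(r)
      		fmt.Printf("pushing %q\n", b)
      		return nil
      	})
      	fmt.Println("err:", err)
      }
      ```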
    • copy metadata from source · 7eda70f2
      Michael Yang authored
  12. 17 Aug, 2023 4 commits
  13. 16 Aug, 2023 3 commits
  14. 15 Aug, 2023 3 commits
  15. 14 Aug, 2023 4 commits