    ollamarunner: Check for minBatch of context space when shifting · bf24498b
    Jesse Gross authored
    Models can specify that a group of inputs needs to be handled in a
    single batch. However, context shifting didn't respect this and could
    trigger a break anyway. In this case, we should instead trigger a
    context shift earlier so that it occurs before the grouped batch.
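As a rough illustration of the check described above, the sketch below decides whether to shift before the next input is added. It is not the code from runner.go: `needsShift`, `used`, `numCtx`, and `minBatch` are hypothetical names chosen for this example, and the assumption is that a grouped batch of `minBatch` inputs must fit entirely in the remaining context space.

```go
package main

import "fmt"

// needsShift reports whether a context shift should happen before the next
// input is added. All names are hypothetical, not taken from runner.go.
//
//	used:     number of inputs already held in the sequence's cache
//	numCtx:   size of the context window
//	minBatch: number of upcoming inputs that must stay in one batch
//	          (values below 1 mean the next input has no grouping requirement)
func needsShift(used, numCtx, minBatch int) bool {
	need := minBatch
	if need < 1 {
		need = 1 // a lone input still needs one free slot
	}
	// Shift early if the whole group would not fit, rather than shifting
	// partway through the group and breaking it.
	return used+need > numCtx
}

func main() {
	fmt.Println(needsShift(2042, 2048, 8)) // 6 slots left, group needs 8
	fmt.Println(needsShift(2040, 2048, 8)) // group fits exactly
	fmt.Println(needsShift(2047, 2048, 0)) // single input, one slot left
}
```

Checking against the group size rather than a single slot is what moves the shift ahead of the grouped batch instead of into the middle of it.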
    
    Note that there are still some corner cases:
     - A long prompt that exceeds the context window can get truncated
       in the middle of an image. With the current models, this will
       result in the model not recognizing the image at all, which is
       pretty much the expected result with truncation.
     - The context window is set to less than the minimum batch size. The
       only solution to this is to refuse to load the model with these
       settings. However, this can never occur with current models and
       default settings.
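The second corner case could, in principle, be caught with a load-time check along the following lines. This is only a sketch of the idea; this commit does not implement it, and `checkContextSize` and its parameters are hypothetical names.

```go
package main

import "fmt"

// checkContextSize returns an error when the context window cannot hold even
// one grouped batch. Hypothetical helper, not part of runner.go.
func checkContextSize(numCtx, minBatch int) error {
	if numCtx < minBatch {
		return fmt.Errorf("context window (%d) is smaller than the minimum batch size (%d)", numCtx, minBatch)
	}
	return nil
}

func main() {
	fmt.Println(checkContextSize(2048, 256)) // fits: no error
	fmt.Println(checkContextSize(128, 256))  // too small: error
}
```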
    
    Since users are unlikely to run into these scenarios, fixing them is
    left as a follow-up.
runner.go 21.1 KB