Commit 499ae731 authored by Jesse Gross's avatar Jesse Gross Committed by Jesse Gross
Browse files

ollamarunner: Base cached tokens on current prompt

When we restore a sequence from the cache, we split the prompt into
the already used tokens (stored in the cache) and new tokens that
need to be processed. Currently, the references to the used tokens
are coming from the stored previous sequence.

However, even though we know that the used tokens are semantically
equivalent to the prefix of the prompt, tokens can contain pointers
which are no longer valid. As a result, it is better to get the
used tokens from the prompt, which has currently valid pointers.

This doesn't currently have any impact because it isn't possible
to reuse the pointers (which are tensors) anyways. However, it
becomes an issue once we can.
parent ef202789
...@@ -104,8 +104,8 @@ func (c *InputCache) LoadCacheSlot(prompt []input, cachePrompt bool) (*InputCach ...@@ -104,8 +104,8 @@ func (c *InputCache) LoadCacheSlot(prompt []input, cachePrompt bool) (*InputCach
slog.Debug("loading cache slot", "id", slot.Id, "cache", len(slot.Inputs), "prompt", len(prompt), slog.Debug("loading cache slot", "id", slot.Id, "cache", len(slot.Inputs), "prompt", len(prompt),
"used", numPast, "remaining", len(prompt)-numPast) "used", numPast, "remaining", len(prompt)-numPast)
slot.Inputs = prompt[:numPast]
prompt = prompt[numPast:] prompt = prompt[numPast:]
slot.Inputs = slot.Inputs[:numPast]
return slot, prompt, nil return slot, prompt, nil
} }
......
...@@ -136,8 +136,8 @@ func (c *InputCache) LoadCacheSlot(prompt []input.Input) (*InputCacheSlot, []inp ...@@ -136,8 +136,8 @@ func (c *InputCache) LoadCacheSlot(prompt []input.Input) (*InputCacheSlot, []inp
slog.Debug("loading cache slot", "id", slot.Id, "cache", len(slot.Inputs), "prompt", len(prompt), slog.Debug("loading cache slot", "id", slot.Id, "cache", len(slot.Inputs), "prompt", len(prompt),
"used", numPast, "remaining", int32(len(prompt))-numPast) "used", numPast, "remaining", int32(len(prompt))-numPast)
slot.Inputs = prompt[:numPast]
prompt = prompt[numPast:] prompt = prompt[numPast:]
slot.Inputs = slot.Inputs[:numPast]
return slot, prompt, nil return slot, prompt, nil
} }
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment