Unverified commit e4d61442, authored by Nick Hill, committed by GitHub

[BugFix] Fix incremental detokenization perf issue (#16963)


Signed-off-by: Nick Hill <nhill@redhat.com>
parent 8d32dc60
@@ -161,7 +161,7 @@ class FastIncrementalDetokenizer(BaseIncrementalDetokenizer):
         prompt_suffix = request.prompt_token_ids
         prompt_len = len(prompt_suffix)
         if prompt_len > 4:
-            for i in range(4, max(prompt_len + 1, 32)):
+            for i in range(4, min(prompt_len + 1, 24)):
                 suffix = request.prompt_token_ids[-i:]
                 if '�' not in self.tokenizer.decode(suffix):
                     prompt_suffix = suffix
......
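The effect of swapping max(..., 32) for min(..., 24) can be checked by counting loop iterations. The sketch below is not part of the commit: the helper names old_bound, new_bound and worst_case_tokens_decoded are hypothetical, and it assumes the worst case where the loop never finds a clean suffix and so runs to the end of its range. Under that assumption the old bounds grow with the prompt length, while the new bounds cap the search at 20 iterations.

```python
def old_bound(prompt_len: int) -> range:
    # Loop bounds from the removed line: the upper limit grows with the prompt.
    return range(4, max(prompt_len + 1, 32))


def new_bound(prompt_len: int) -> range:
    # Loop bounds from the added line: the upper limit never exceeds 24.
    return range(4, min(prompt_len + 1, 24))


def worst_case_tokens_decoded(bounds: range, prompt_len: int) -> int:
    # Each iteration decodes the slice prompt_token_ids[-i:], which holds
    # at most prompt_len tokens. Assumption: the loop never exits early.
    return sum(min(i, prompt_len) for i in bounds)


if __name__ == "__main__":
    for n in (8, 100, 10_000):
        print(
            f"prompt_len={n}: "
            f"old iterations={len(old_bound(n))}, "
            f"new iterations={len(new_bound(n))}, "
            f"old tokens decoded={worst_case_tokens_decoded(old_bound(n), n)}, "
            f"new tokens decoded={worst_case_tokens_decoded(new_bound(n), n)}"
        )
```

With the old bounds, the total number of tokens passed to decode in the worst case grows roughly quadratically with the prompt length, which is consistent with the perf issue named in the commit title. With the new bounds it is a small constant.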