Unverified commit edd270bc, authored by Peter Pan, committed by GitHub

[Bugfix] Prevent IndexError for cached requests when pipeline parallelism is disabled (#20486)


Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
parent 110df743
@@ -635,6 +635,8 @@ class Scheduler(SchedulerInterface):
                 token_ids = req.all_token_ids[req.num_computed_tokens:req.
                                               num_computed_tokens + num_tokens]
                 new_token_ids.append(token_ids)
+            else:
+                new_token_ids.append([])
             new_block_ids.append(req_to_new_block_ids[req_id])
             num_computed_tokens.append(req.num_computed_tokens)
         # Because resumed_reqs is usually empty, it is more efficient to do
......
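
Note on the change: the hunk suggests that `new_token_ids` is built in parallel with the other per-request lists, so any consumer that indexes it by request position needs one entry per request. The sketch below is a minimal, standalone illustration of that idea and is not the vLLM implementation. `FakeRequest`, `build_cached_request_lists`, the `use_pp` flag, and the `pad_when_no_pp` switch are all hypothetical names introduced here; the condition guarding the first branch is not visible in the hunk and is assumed from the commit title to be a pipeline-parallelism check.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a scheduled request (not a vLLM class)."""
    request_id: str
    all_token_ids: list
    num_computed_tokens: int


def build_cached_request_lists(reqs, num_tokens_by_id, use_pp,
                               pad_when_no_pp=True):
    """Build parallel per-request lists, mirroring the shape of the hunk.

    pad_when_no_pp=True models the patched behavior (append [] in the
    else branch); False models the pre-patch behavior (append nothing).
    """
    req_ids = []
    new_token_ids = []
    num_computed_tokens = []
    for req in reqs:
        req_ids.append(req.request_id)
        num_tokens = num_tokens_by_id[req.request_id]
        if use_pp:  # assumed guard; the real condition is outside the hunk
            start = req.num_computed_tokens
            new_token_ids.append(req.all_token_ids[start:start + num_tokens])
        elif pad_when_no_pp:
            # Keep the list aligned with req_ids even with nothing to send.
            new_token_ids.append([])
        num_computed_tokens.append(req.num_computed_tokens)
    return req_ids, new_token_ids, num_computed_tokens


reqs = [
    FakeRequest("a", [1, 2, 3, 4], 2),
    FakeRequest("b", [5, 6, 7], 1),
]
num_tokens_by_id = {"a": 1, "b": 2}

# Patched behavior: one (empty) entry per request, so positional
# indexing stays in range.
ids, tokens, _ = build_cached_request_lists(reqs, num_tokens_by_id,
                                            use_pp=False)
for i, req_id in enumerate(ids):
    print(req_id, tokens[i])
```

With `pad_when_no_pp=False`, the same loop would fail on `tokens[0]` with an `IndexError`, because the list stays empty while `ids` holds two entries. Appending an empty list costs little and keeps all per-request lists the same length, which is presumably why the patch takes this route rather than guarding every consumer.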