Commit efb099cd authored by Liangsheng Yin, committed by GitHub

Fix prefill oom (#1743)

parent 09603c6d
@@ -427,7 +427,7 @@ class Scheduler:
                 if req.sampling_params.max_new_tokens is not None
                 else 1 << 30
             ),
-            self.max_req_input_len - 1 - len(req.origin_input_ids),
+            self.max_req_input_len - len(req.origin_input_ids),
         )
         self.waiting_queue.append(req)
......
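The arithmetic effect of the one-line change can be sketched in isolation. This is a minimal standalone illustration, not the scheduler's code: the function name `clamp_max_new_tokens` and the `reserve` parameter are hypothetical, introduced only to compare the old bound (`reserve=1`) with the new one (`reserve=0`).

```python
from typing import Optional

# Stand-in for "no limit requested", mirroring the `1 << 30` in the diff.
UNLIMITED = 1 << 30


def clamp_max_new_tokens(
    requested: Optional[int],
    input_len: int,
    max_req_input_len: int,
    reserve: int = 0,
) -> int:
    """Cap the generation budget by the room left after the prompt.

    reserve=1 reproduces the old expression (max_req_input_len - 1 - input_len);
    reserve=0 reproduces the new one (max_req_input_len - input_len).
    """
    wanted = requested if requested is not None else UNLIMITED
    return min(wanted, max_req_input_len - reserve - input_len)


# Old vs. new bound for a prompt that exactly fills the input limit.
old_budget = clamp_max_new_tokens(None, input_len=100, max_req_input_len=100, reserve=1)
new_budget = clamp_max_new_tokens(None, input_len=100, max_req_input_len=100, reserve=0)
print(old_budget, new_budget)
```

Under these assumptions, a prompt whose length equals `max_req_input_len` gets a budget of -1 with the old expression and 0 with the new one; for shorter prompts the new bound is simply one token larger.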