Commit c0697921 authored by zhuwenwen

Merge branch 'v0.9.2-dev-fix-zero' into 'v0.9.2-dev'

fix: resolve the chunk-prefill crash in the original zero-overhead version

See merge request dcutoolkit/deeplearing/vllm!393
parents 9c95f8b0 2b1be0e8
@@ -796,6 +796,7 @@ class V1ZeroModelRunner(GPUModelRunner):
req_state = self.requests[req_id]
token_idx = self.last_sampled_token_lens[req_idx]
if token_idx == -1:
    self.fix_sampled_token_ids[req_idx].clear()
    continue
fix_len = len(self.fix_sampled_token_ids[req_idx])
req_state.output_token_ids[token_idx:token_idx + fix_len] = self.fix_sampled_token_ids[req_idx]
......
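The hunk above can be read as a per-request patch step. The following is a minimal standalone sketch of that step, not the actual vLLM code. It assumes that `-1` in `last_sampled_token_lens` marks a request with no token position to patch in this step, and that the fix buffer is cleared so stale entries are not carried into a later step. The function name `apply_fixups` and the plain-list data layout are hypothetical.

```python
def apply_fixups(output_token_ids, last_sampled_token_lens, fix_sampled_token_ids):
    """Patch each request's output tokens in place from its pending fix buffer.

    All three arguments are lists indexed by req_idx (a hypothetical layout,
    standing in for the runner's per-request state).
    """
    for req_idx, token_idx in enumerate(last_sampled_token_lens):
        if token_idx == -1:
            # Assumed meaning: nothing to patch for this request in this step.
            # Drop any buffered tokens so they cannot be applied later.
            fix_sampled_token_ids[req_idx].clear()
            continue
        fix = fix_sampled_token_ids[req_idx]
        fix_len = len(fix)
        # Slice assignment overwrites the fix_len slots starting at token_idx.
        output_token_ids[req_idx][token_idx:token_idx + fix_len] = fix


# Request 0 has one placeholder slot to patch; request 1 is skipped.
outputs = [[1, 2, 0], [7, 8]]
lens = [2, -1]
fixes = [[9], [5, 5]]
apply_fixups(outputs, lens, fixes)
```

In this sketch the skipped request keeps its output tokens unchanged and ends the step with an empty fix buffer, which matches the behaviour of the branch shown in the hunk.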