Unverified commit 2a194ddd, authored by Woosuk Kwon, committed by GitHub

[Model Runner V2] Add model_state inputs to CUDA graph capture (#36544)


Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
parent 203a7f27
@@ -320,6 +320,7 @@ class ModelCudaGraphManager(CudaGraphManager):
 model_inputs = {
     "input_ids": input_buffers.input_ids[:num_tokens],
     "positions": input_buffers.positions[:num_tokens],
+    **model_state.prepare_dummy_inputs(num_reqs, num_tokens),
 }
 model_output = model(**model_inputs)
 if self.use_aux_hidden_state_outputs:
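
The hunk merges whatever `model_state.prepare_dummy_inputs(num_reqs, num_tokens)` returns into the keyword arguments passed to the model during capture. The diff does not show that method, so the following is only a minimal sketch of the pattern, assuming it returns a dict of extra keyword arguments. `DummyModelState`, `build_model_inputs`, and the `extra_mask` key are hypothetical names used for illustration, and plain lists stand in for the real input buffers.

```python
# Hypothetical stand-in for a model state object; the real class is not
# shown in this diff.
class DummyModelState:
    def prepare_dummy_inputs(self, num_reqs: int, num_tokens: int) -> dict:
        # Assumption: returns extra keyword arguments for the model call,
        # sized to the capture batch. The key name is illustrative only.
        return {"extra_mask": [0] * num_tokens}


def build_model_inputs(input_ids, positions, model_state, num_reqs, num_tokens):
    # Mirrors the dict construction in the hunk: fixed inputs first, then
    # model-specific inputs merged in via ** unpacking.
    return {
        "input_ids": input_ids[:num_tokens],
        "positions": positions[:num_tokens],
        **model_state.prepare_dummy_inputs(num_reqs, num_tokens),
    }


inputs = build_model_inputs(list(range(8)), list(range(8)), DummyModelState(), 2, 4)
print(sorted(inputs))
```

Under this assumption, a model state that needs no extra inputs could return an empty dict, and the `**` unpacking would leave `model_inputs` unchanged.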