Unverified Commit 7151ae65 authored by yanghui1-arch, committed by GitHub

[Bugfix] RoBERTa position_id accumulation in CUDA graph padding region (#37873)


Signed-off-by: dass90 <3053034939@qq.com>
parent 45bd5c8e
@@ -3084,6 +3084,8 @@ class GPUModelRunner(
             positions = self.xdrope_positions.gpu[:, :num_input_tokens]
         else:
             positions = self.positions.gpu[:num_input_tokens]
+            if num_input_tokens > num_scheduled_tokens:
+                self.positions.gpu[num_scheduled_tokens:num_input_tokens].zero_()
         if is_first_rank:
             intermediate_tensors = None
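A minimal sketch of why the padding region needs zeroing, in plain Python rather than the real tensors. It assumes the RoBERTa path offsets position ids in place on the slice it receives, and that the persistent positions buffer is only refreshed for the scheduled tokens each step. The names `ROBERTA_OFFSET`, `BUFFER_SIZE`, `run_step`, and `max_position_after` are hypothetical and exist only for this illustration; only `num_input_tokens` and `num_scheduled_tokens` come from the diff.

```python
# Hypothetical simulation of the persistent positions buffer; not vLLM code.
ROBERTA_OFFSET = 2  # assumed offset (padding_idx + 1 with padding_idx = 1)
BUFFER_SIZE = 8     # assumed size of the persistent buffer


def run_step(buffer, scheduled_positions, num_input_tokens, zero_padding):
    """Simulate one forward step and return the largest position id seen."""
    num_scheduled_tokens = len(scheduled_positions)
    # Only the scheduled prefix is rewritten every step.
    buffer[:num_scheduled_tokens] = scheduled_positions
    # The fix from the diff: clear the CUDA graph padding region.
    if zero_padding and num_input_tokens > num_scheduled_tokens:
        for i in range(num_scheduled_tokens, num_input_tokens):
            buffer[i] = 0
    # Assumed model behaviour: in-place offset over the whole padded slice,
    # which also mutates the persistent buffer.
    for i in range(num_input_tokens):
        buffer[i] += ROBERTA_OFFSET
    return max(buffer[:num_input_tokens])


def max_position_after(steps, zero_padding):
    """Run several steps with 3 scheduled tokens padded to 4 input tokens."""
    buffer = [0] * BUFFER_SIZE
    result = 0
    for _ in range(steps):
        result = run_step(buffer, [0, 1, 2], num_input_tokens=4,
                          zero_padding=zero_padding)
    return result


if __name__ == "__main__":
    print("without zeroing:", max_position_after(100, zero_padding=False))
    print("with zeroing:   ", max_position_after(100, zero_padding=True))
```

Under these assumptions the padded slot grows by the offset on every step when it is not cleared, so it can eventually exceed the size of the position embedding table. With zeroing, the largest position id stays bounded by the scheduled positions plus the offset.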