Commit ca8d02ab authored by Stefan He, committed by GitHub

FA3 Spec Decoding to support top k = 1 and add cuda graph support (#5050)


Co-authored-by: Qingquan Song <ustcsqq@gmail.com>
Co-authored-by: Chunan Zeng <zcnrex@gmail.com>
parent 3f287b85
...
@@ -104,6 +104,9 @@ class ForwardMode(IntEnum):
             or self == ForwardMode.IDLE
         )
 
+    def is_extend_or_draft_extend(self):
+        return self == ForwardMode.EXTEND or self == ForwardMode.DRAFT_EXTEND
+
     def is_dummy_first(self):
         return self == ForwardMode.DUMMY_FIRST
...
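
For readers who want to try the new helper in isolation, here is a minimal standalone sketch of how `is_extend_or_draft_extend` behaves. It is not the project's actual `ForwardMode` definition: only the member names visible in the hunk (`EXTEND`, `DRAFT_EXTEND`, `IDLE`, `DUMMY_FIRST`) are included, and their numeric values and ordering are assumptions made with `auto()`.

```python
from enum import IntEnum, auto


class ForwardMode(IntEnum):
    # Assumption: only the members visible in the diff are listed here, and
    # their values/order are placeholders, not the project's real ones.
    EXTEND = auto()
    DRAFT_EXTEND = auto()
    IDLE = auto()
    DUMMY_FIRST = auto()

    def is_extend_or_draft_extend(self):
        # Helper added by this commit: true for both extend-style modes.
        return self == ForwardMode.EXTEND or self == ForwardMode.DRAFT_EXTEND

    def is_dummy_first(self):
        return self == ForwardMode.DUMMY_FIRST


if __name__ == "__main__":
    for mode in ForwardMode:
        print(mode.name, mode.is_extend_or_draft_extend())
```

The helper lets a caller branch once on "any extend-style pass" instead of repeating the two-way comparison at every call site.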