    fix: qwen25vl assign samebatch in multimodal input (#10789) · 69b2fe92
    Michael Yang authored
    setting SameBatch on the vision start token is problematic because that
    token is shared with other inputs that also use images. the shared input
    gets cached, so the runner will not see SameBatch. the SameBatch value
    may also be incorrect, since it may have been set for a different image.
    
    assigning SameBatch to the image's input tokens resolves this by ensuring
    it is set correctly on the inputs that correspond to that image.
    
    not setting SameBatch correctly may cause panics during inference, since
    an image's inputs are no longer guaranteed to be in the same batch.
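
The change described above can be sketched roughly as follows. This is a minimal illustration, not the code in model.go: the `Input` struct, the token constants, and `imageInputs` are hypothetical names, and the sketch assumes SameBatch counts the inputs that must follow in the same batch. The point is only that the shared vision start token carries no SameBatch, while the first input belonging to the image does.

```go
package main

// Input is a hypothetical stand-in for one runner input.
// The field names are assumptions made for this sketch.
type Input struct {
	Token     int32
	Image     bool // true on the input that carries the image data
	SameBatch int  // assumed meaning: following inputs that must share the batch
}

// Hypothetical token IDs, not the model's real vocabulary.
const (
	visionStart int32 = iota + 1
	imagePad
	visionEnd
)

// imageInputs expands one image of `patches` patches into inputs.
// SameBatch is set on the first image input, not on visionStart,
// because the start token is identical for every image.
func imageInputs(patches int) []Input {
	if patches < 1 {
		return nil
	}
	out := make([]Input, 0, patches+2)
	out = append(out, Input{Token: visionStart}) // shared token: no SameBatch
	// The first image input carries the data and SameBatch, covering the
	// remaining placeholders plus the end token.
	out = append(out, Input{Token: imagePad, Image: true, SameBatch: patches})
	for i := 1; i < patches; i++ {
		out = append(out, Input{Token: imagePad})
	}
	out = append(out, Input{Token: visionEnd})
	return out
}
```

With this layout, SameBatch travels with the image-specific input, so a cached start token from an earlier prompt would not affect how a later image is batched.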
model.go 4.46 KB