[fix] Fix an issue with parallel decoding enabled: under extreme test conditions, the speculative-disable-by-batch-size setting causes parallel decoding to be skipped, so previous_hidden_states keeps growing until GPU memory is exhausted and the service stops responding.
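The diff itself is not shown here, so the following is only a toy model of the failure mode described above, not the actual patch. It assumes that hidden states are appended on every step and are only consumed when parallel decoding actually runs. `ToyHiddenStateCache`, `prune_when_skipped`, and `simulate` are hypothetical names introduced for illustration; only `previous_hidden_states` and the batch-size threshold come from the commit message, and the `>` comparison against the threshold is an assumption.

```python
class ToyHiddenStateCache:
    """Hypothetical stand-in for a worker that keeps previous_hidden_states."""

    def __init__(self, disable_by_batch_size, prune_when_skipped):
        # Assumed meaning: parallel decoding is skipped when the batch
        # is larger than this threshold.
        self.disable_by_batch_size = disable_by_batch_size
        # Hypothetical switch modelling the fix: drop stale states on skipped steps.
        self.prune_when_skipped = prune_when_skipped
        self.previous_hidden_states = []

    def step(self, batch_size):
        # Every step stores one placeholder hidden state per sequence.
        self.previous_hidden_states.extend([0.0] * batch_size)
        skipped = batch_size > self.disable_by_batch_size
        if not skipped or self.prune_when_skipped:
            # Either parallel decoding consumed the states, or they are
            # explicitly dropped because nothing will consume them.
            self.previous_hidden_states.clear()
        return len(self.previous_hidden_states)


def simulate(prune_when_skipped, steps=100, batch_size=8, threshold=4):
    """Run `steps` decode steps and return the final buffer length."""
    cache = ToyHiddenStateCache(threshold, prune_when_skipped)
    size = 0
    for _ in range(steps):
        size = cache.step(batch_size)
    return size


if __name__ == "__main__":
    print("without pruning:", simulate(prune_when_skipped=False))
    print("with pruning:", simulate(prune_when_skipped=True))
```

In this sketch the buffer grows linearly with the number of skipped steps unless it is pruned, which mirrors the unbounded memory growth reported above when the batch stays over the threshold for a long time.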