"docs/source/quantization/fp8_e4m3_kvcache.rst" did not exist on "3dcb3e8b9838cbbef83ce326b1a35b31a3cf14f2"
zhuwenwen authored
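[fix] Fix an issue where, with parallel decoding enabled, under extreme test conditions, setting speculative-disable-by-batch-size caused parallel decoding to be skipped, so previous_hidden_states kept growing until GPU memory was exhausted and the service stopped responding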
58fc3e31