Merge branch 'v0.15.1-dev-reduced-topk-topp' into 'v0.15.1-dev'
feat(sampler): add a reduced top-k + top-p sampling fast path to cut full-vocabulary softmax overhead
See merge request dcutoolkit/deeplearing/vllm!447
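
The diff for this merge request is not shown here, so the following is only a minimal sketch of what a "reduced" top-k + top-p path could look like, not the code that was merged. It assumes the idea is to select the k largest logits first and run softmax over those k candidates only, which gives the same distribution as a full-vocabulary softmax followed by top-k masking and renormalization. The function name `reduced_topk_topp_sample` and its parameters are hypothetical, and plain Python stands in for the tensor kernels a real sampler would use.

```python
import heapq
import math
import random


def reduced_topk_topp_sample(logits, k, p, rng=random):
    """Sample one token id using top-k then top-p on a reduced candidate set.

    Hypothetical illustration: softmax is computed over the k selected
    logits only, never over the full vocabulary.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be in [1, len(logits)]")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")

    # Step 1: pick the k largest logits as (token_id, logit) pairs,
    # sorted from largest to smallest. No normalization happens here.
    top = heapq.nlargest(k, enumerate(logits), key=lambda pair: pair[1])

    # Step 2: numerically stable softmax over the k survivors only.
    max_logit = top[0][1]
    weights = [math.exp(value - max_logit) for _, value in top]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Step 3: top-p. Keep the shortest prefix whose cumulative
    # probability reaches p (the whole set if rounding prevents that).
    cumulative = 0.0
    keep = 0
    for prob in probs:
        cumulative += prob
        keep += 1
        if cumulative >= p:
            break

    # Step 4: sample from the kept prefix. random.choices treats the
    # weights as relative, so no explicit renormalization is needed.
    token_ids = [token_id for token_id, _ in top[:keep]]
    return rng.choices(token_ids, weights=probs[:keep], k=1)[0]
```

For example, `reduced_topk_topp_sample([0.0, 5.0, 1.0, 4.0], k=2, p=1.0)` can only return token id 1 or 3, because the other two logits are dropped before any probabilities are computed.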