test_prompt.txt · 216a63b858d04d8734352c5aba27f16e34489165 · OpenDAS / ktransformers

Atream authored Feb 22, 2025

Use Marlin for lm_head; lm_head computes only the last token during prefill
Extend context window to 19K for DeepSeek-V3/R1 within 24GB VRAM

5ec33d04

test_prompt.txt 75.5 KB