vllm_serve.sh 145 Bytes
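# Serve PowerInfer/SmallThinker-3B-Preview with vLLM on a single GPU
# (tensor parallel size 1), in bfloat16, with a 32768-token context window.
vllm serve PowerInfer/SmallThinker-3B-Preview \
  --trust-remote-code \
  --dtype bfloat16 \
  --max-seq-len-to-capture 32768 \
  --tensor-parallel-size 1 \
  --max-model-len 32768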