Error when launching Qwen3.5-122B-A10B-GPTQ-Int4
Startup command:

vllm serve /root/Qwen3.5-122B-A10B-GPTQ-Int4 --gpu-memory-utilization 0.95 --served-model-name qwen3.5-122b --host 0.0.0.0 --port 8001 --tensor-parallel-size 4 --max-model-len 32768 --dtype float16 --quantization gptq --enable-auto-tool-choice --tool-call-parser qwen3_coder --reasoning-parser qwen3 --default-chat-template-kwargs '{"enable_thinking": false}'

The error output:

ERROR 04-11 18:54:40 [multiproc_executor.py:246] Worker proc VllmWorker-2 died unexpectedly, shutting down executor.
Process EngineCore_DP0:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 950, in run_engine_core
    raise e
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 937, in run_engine_core
    engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 691, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 112, in __init__
    num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 242, in _initialize_kv_caches
    available_gpu_memory = self.model_executor.determine_available_memory()
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
    return self.collective_rpc("determine_available_memory")
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/multiproc_executor.py", line 374, in collective_rpc
    return aggregate(get_response())
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/multiproc_executor.py", line 357, in get_response
    raise RuntimeError(
RuntimeError: Worker failed with error 'name 'get_moe_triton_config_w4a16' is not defined', please check the stack trace above for the root cause
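The NameError suggests the installed vLLM build does not define `get_moe_triton_config_w4a16` at the point the GPTQ MoE path calls it, which often comes down to a version/build mismatch. To help triage, it is worth attaching the exact environment versions to this report; a minimal sketch for collecting them (the `|| true` guards just keep it from aborting on machines missing a tool):

```shell
# Collect environment details to attach to the bug report.
python3 --version
# Installed vLLM version (first two lines of pip metadata, if present)
pip show vllm 2>/dev/null | head -n 2 || true
# GPU model and memory, if nvidia-smi is available
nvidia-smi --query-gpu=name,memory.total --format=csv 2>/dev/null || true
```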