[V1] Fix Detokenizer loading in `AsyncLLM` (#10997)

Signed-off-by: Roger Wang <ywang@roblox.com>

Commit c6903579 authored Dec 09, 2024 by Roger Wang, committed by GitHub Dec 09, 2024

Showing with 6 additions and 1 deletion

vllm/v1/engine/async_llm.py +6 -1

--- a/vllm/v1/engine/async_llm.py
+++ b/vllm/v1/engine/async_llm.py
@@ -65,7 +65,12 @@ class AsyncLLM(EngineClient):
                                   input_registry)

        # Detokenizer (converts EngineCoreOutputs --> RequestOutput).
-        self.detokenizer = Detokenizer(vllm_config.model_config.tokenizer)
+        self.detokenizer = Detokenizer(
+            tokenizer_name=vllm_config.model_config.tokenizer,
+            tokenizer_mode=vllm_config.model_config.tokenizer_mode,
+            trust_remote_code=vllm_config.model_config.trust_remote_code,
+            revision=vllm_config.model_config.tokenizer_revision,
+        )

        # EngineCore (starts the engine in background process).
        self.engine_core = EngineCoreClient.make_client(
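
The hunk above replaces a single positional argument with four keyword arguments. A minimal sketch of why that matters, using hypothetical stand-in classes (`ModelConfigSketch`, `DetokenizerSketch`) rather than the real vLLM types; the default values on the stand-in are assumptions for illustration and are not taken from the vLLM source:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for the tokenizer fields read from model_config."""
    tokenizer: str
    tokenizer_mode: str
    trust_remote_code: bool
    tokenizer_revision: Optional[str]


@dataclass
class DetokenizerSketch:
    """Hypothetical stand-in for Detokenizer; the defaults are assumptions."""
    tokenizer_name: str
    tokenizer_mode: str = "auto"
    trust_remote_code: bool = False
    revision: Optional[str] = None


def build_before(cfg: ModelConfigSketch) -> DetokenizerSketch:
    # Mirrors the removed line: only the tokenizer name is forwarded,
    # so every other setting silently falls back to its default.
    return DetokenizerSketch(cfg.tokenizer)


def build_after(cfg: ModelConfigSketch) -> DetokenizerSketch:
    # Mirrors the added lines: each tokenizer setting is forwarded by keyword.
    return DetokenizerSketch(
        tokenizer_name=cfg.tokenizer,
        tokenizer_mode=cfg.tokenizer_mode,
        trust_remote_code=cfg.trust_remote_code,
        revision=cfg.tokenizer_revision,
    )


# Illustrative configuration; the tokenizer name is made up.
cfg = ModelConfigSketch(
    tokenizer="example-org/custom-tokenizer",
    tokenizer_mode="slow",
    trust_remote_code=True,
    tokenizer_revision="v2",
)
before = build_before(cfg)
after = build_after(cfg)
```

Under these assumptions, `before` keeps only the tokenizer name, while `after` carries the mode, the remote-code flag, and the revision from the config.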