Commit 4ba88757, authored by Wentao Ye, committed by GitHub

[Bug] Fix Test in Batch Invariant (#26128)


Signed-off-by: yewentao256 <zhyanwentao@126.com>
parent 6273fe8d
@@ -292,8 +292,11 @@ def LLM_with_max_seqs(
 # Allow some CPU offload if needed.
 swap_space=swap_space,
 # Keep things lean and CI-friendly.
-dtype="float16",
+dtype="auto",
 # Single-GPU by default; override externally if desired.
 tensor_parallel_size=int(os.getenv("VLLM_TP_SIZE", "1")),
 trust_remote_code=os.getenv("VLLM_TRUST_REMOTE_CODE", "0") == "1",
+enable_prefix_caching=False,
+# Enable for MOE models
+# enable_expert_parallel=True,
 )
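
To make the effect of the hunk easier to check without loading a model, the keyword arguments it produces can be sketched as a plain dictionary. This is only an illustration: `build_llm_kwargs` and its `env` parameter are hypothetical names that do not appear in the commit, and the sketch assumes the values are resolved exactly as written in the diff (`VLLM_TP_SIZE` defaulting to `"1"`, `VLLM_TRUST_REMOTE_CODE` defaulting to `"0"`).

```python
import os


def build_llm_kwargs(swap_space, env=None):
    """Hypothetical helper mirroring the keyword arguments in the hunk above.

    `env` is an assumption added for testability; the commit itself reads
    the process environment through os.getenv.
    """
    env = os.environ if env is None else env
    return {
        # Allow some CPU offload if needed.
        "swap_space": swap_space,
        # Changed by this commit: previously "float16".
        "dtype": "auto",
        # Single-GPU by default; override with VLLM_TP_SIZE.
        "tensor_parallel_size": int(env.get("VLLM_TP_SIZE", "1")),
        # Only the exact string "1" enables remote code.
        "trust_remote_code": env.get("VLLM_TRUST_REMOTE_CODE", "0") == "1",
        # Added by this commit.
        "enable_prefix_caching": False,
    }


if __name__ == "__main__":
    # With an empty environment, the defaults from the diff apply.
    print(build_llm_kwargs(swap_space=4, env={}))
```

Note that `trust_remote_code` compares against the literal string `"1"`, so values such as `"true"` leave it disabled.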