OpenDAS / vllm_cscc
Commit c8627cd4 (Unverified)
Authored Oct 09, 2024 by youkaichao; committed by GitHub on Oct 09, 2024

[ci][test] use load dummy for testing (#9165)

Parent: 8bfaa4e3

Changes: 3 changed files with 20 additions and 1 deletion

.buildkite/test-pipeline.yaml    +1  -1
tests/utils.py                   +17 -0
vllm/envs.py                     +2  -0

.buildkite/test-pipeline.yaml
@@ -269,7 +269,7 @@ steps:
   - csrc/
   - vllm/model_executor/layers/quantization
   - tests/quantization
-  command: pytest -v -s quantization
+  command: VLLM_TEST_FORCE_LOAD_FORMAT=auto pytest -v -s quantization
 
 - label: LM Eval Small Models # 53min
   working_dir: "/vllm-workspace/.buildkite/lm-eval-harness"
...

tests/utils.py
@@ -16,6 +16,7 @@ import requests
 from openai.types.completion import Completion
 from typing_extensions import ParamSpec, assert_never
 
+import vllm.envs as envs
 from tests.models.utils import TextTextLogprobs
 from vllm.distributed import (ensure_model_parallel_initialized,
                               init_distributed_environment)
...
@@ -352,10 +353,26 @@ def compare_all_settings(model: str,
         tokenizer_mode=tokenizer_mode,
     )
 
+    can_force_load_format = True
+
+    for args in all_args:
+        if "--load-format" in args:
+            can_force_load_format = False
+            break
+
     prompt = "Hello, my name is"
     token_ids = tokenizer(prompt).input_ids
     ref_results: List = []
     for i, (args, env) in enumerate(zip(all_args, all_envs)):
+        if can_force_load_format:
+            # we are comparing the results and
+            # usually we don't need real weights.
+            # we force to use dummy weights by default,
+            # and it should work for most of the cases.
+            # if not, we can use VLLM_TEST_FORCE_LOAD_FORMAT
+            # environment variable to force the load format,
+            # e.g. in quantization tests.
+            args = args + ["--load-format", envs.VLLM_TEST_FORCE_LOAD_FORMAT]
         compare_results: List = []
         results = ref_results if i == 0 else compare_results
         with RemoteOpenAIServer(model,
...

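The rule added to compare_all_settings can be read in isolation as the minimal sketch below. The helper name maybe_force_load_format and its load_format parameter are hypothetical and do not appear in the commit; the sketch only mirrors the behavior visible in the hunk, where a single explicit "--load-format" in any argument list disables forcing for all of them.

```python
from typing import List


def maybe_force_load_format(all_args: List[List[str]],
                            load_format: str = "dummy") -> List[List[str]]:
    """Hypothetical helper mirroring the hunk above (not part of the commit)."""
    # If any configuration already picks a load format, leave every list as-is.
    if any("--load-format" in args for args in all_args):
        return [list(args) for args in all_args]
    # Otherwise append the forced format to each argument list.
    return [list(args) + ["--load-format", load_format] for args in all_args]
```

Because the check is list membership, it only detects "--load-format" passed as its own element; a combined "--load-format=auto" token would not match, in this sketch or in the hunk.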
vllm/envs.py
@@ -397,6 +397,8 @@ environment_variables: Dict[str, Callable[[], Any]] = {
     lambda:
     (os.environ.get("VLLM_TEST_FORCE_FP8_MARLIN", "0").strip().lower() in
      ("1", "true")),
+    "VLLM_TEST_FORCE_LOAD_FORMAT":
+    lambda: os.getenv("VLLM_TEST_FORCE_LOAD_FORMAT", "dummy"),
 
     # Time in ms for the zmq client to wait for a response from the backend
     # server for simple data operations
...
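
The new entry stores a zero-argument callable, not a value, so the variable is read when the callable runs and not when the dictionary is built. A minimal standalone sketch of that pattern follows; the read_env function is a hypothetical accessor for illustration, and how vllm/envs.py actually exposes entries as attributes is not shown in this diff.

```python
import os
from typing import Any, Callable, Dict

# Stand-in for the registry in vllm/envs.py; only the new entry is reproduced.
environment_variables: Dict[str, Callable[[], Any]] = {
    "VLLM_TEST_FORCE_LOAD_FORMAT":
    lambda: os.getenv("VLLM_TEST_FORCE_LOAD_FORMAT", "dummy"),
}


def read_env(name: str) -> Any:
    """Hypothetical accessor: call the getter so os.environ is read right now."""
    return environment_variables[name]()
```

With the variable unset, read_env("VLLM_TEST_FORCE_LOAD_FORMAT") falls back to "dummy"; the quantization CI step in this commit sets it to auto instead.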