OpenDAS / vllm_cscc
Commit c882a7f5 (Unverified)
Authored Jul 24, 2024 by Nick Hill; committed by GitHub, Jul 24, 2024
[SpecDecoding] Update MLPSpeculator CI tests to use smaller model (#6714)
Parent: 5e8ca973
Showing 1 changed file with 3 additions and 3 deletions: tests/spec_decode/e2e/test_mlp_correctness.py (+3, -3)
--- a/tests/spec_decode/e2e/test_mlp_correctness.py
+++ b/tests/spec_decode/e2e/test_mlp_correctness.py
...
@@ -24,14 +24,14 @@ import pytest
 from .conftest import run_greedy_equality_correctness_test

 # main model
-MAIN_MODEL = "ibm-granite/granite-3b-code-instruct"
+MAIN_MODEL = "JackFram/llama-160m"

 # speculative model
-SPEC_MODEL = "ibm-granite/granite-3b-code-instruct-accelerator"
+SPEC_MODEL = "ibm-fms/llama-160m-accelerator"

 # max. number of speculative tokens: this corresponds to
 # n_predict in the config.json of the speculator model.
-MAX_SPEC_TOKENS = 5
+MAX_SPEC_TOKENS = 3

 # precision
 PRECISION = "float32"
...
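The comment in the diff ties MAX_SPEC_TOKENS to the n_predict field in the speculator model's config.json. A minimal sketch of keeping the two in sync might look like the following. The helpers read_n_predict and check_spec_tokens are hypothetical names, not part of vLLM or this test file, and the config file used here is a stand-in written to a temporary directory, not the real model's config.

```python
import json
import tempfile
from pathlib import Path


def read_n_predict(config_path):
    """Return the n_predict value from a speculator config.json (hypothetical helper)."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return int(config["n_predict"])


def check_spec_tokens(requested, config_path):
    """Return `requested` if it does not exceed n_predict, else raise ValueError."""
    n_predict = read_n_predict(config_path)
    if requested > n_predict:
        raise ValueError(
            f"requested {requested} speculative tokens, "
            f"but the speculator config only allows {n_predict}"
        )
    return requested


# Stand-in config for illustration; the value 3 mirrors MAX_SPEC_TOKENS in the diff.
demo_dir = Path(tempfile.mkdtemp())
demo_config = demo_dir / "config.json"
demo_config.write_text(json.dumps({"n_predict": 3}), encoding="utf-8")

MAX_SPEC_TOKENS = check_spec_tokens(3, demo_config)
```

With a check of this kind, asking for the old value of 5 against a config whose n_predict is 3 would fail early instead of surfacing later in the test run.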