OpenDAS / vllm_cscc
Commit 14b7899d (Unverified)
Authored Feb 12, 2025 by Michael Goin
Committed by GitHub, Feb 12, 2025

[CI] Fix failing FP8 cpu offload test (#13170)

Signed-off-by: mgoin <mgoin64@gmail.com>

Parent: 09972e71

Showing 1 changed file with 6 additions and 6 deletions

tests/quantization/test_cpu_offload.py @ 14b7899d
...
@@ -14,13 +14,13 @@ from ..utils import compare_two_settings
                     reason="fp8 is not supported on this GPU type.")
 def test_cpu_offload_fp8():
     # Test quantization of an unquantized checkpoint
-    compare_two_settings("meta-llama/Meta-Llama-3-8B-Instruct",
+    compare_two_settings("meta-llama/Llama-3.2-1B-Instruct",
                          ["--quantization", "fp8"],
-                         ["--quantization", "fp8", "--cpu-offload-gb", "2"],
+                         ["--quantization", "fp8", "--cpu-offload-gb", "1"],
                          max_wait_seconds=480)
     # Test loading a quantized checkpoint
-    compare_two_settings("neuralmagic/Meta-Llama-3-8B-Instruct-FP8", [],
-                         ["--cpu-offload-gb", "2"],
+    compare_two_settings("neuralmagic/Qwen2-1.5B-Instruct-FP8", [],
+                         ["--cpu-offload-gb", "1"],
                          max_wait_seconds=480)
...
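
For readers skimming the hunk, the sketch below lists the two baseline/offload argument pairs that the updated test passes to `compare_two_settings`. It is only an illustration: `offload_args` and `CASES` are hypothetical names that do not appear in the commit, and the snippet builds the argument lists without starting vLLM or calling the real test helper.

```python
# Hypothetical helper (not part of the commit): builds the CLI argument
# lists that the updated test hands to compare_two_settings.
def offload_args(quantization=None, cpu_offload_gb=None):
    args = []
    if quantization is not None:
        args += ["--quantization", quantization]
    if cpu_offload_gb is not None:
        # The test passes the offload size as a string, e.g. "1".
        args += ["--cpu-offload-gb", str(cpu_offload_gb)]
    return args


# (model, baseline args, args with CPU offload), as they appear after the change.
CASES = [
    # Quantize an unquantized checkpoint to FP8 at load time.
    ("meta-llama/Llama-3.2-1B-Instruct",
     offload_args("fp8"),
     offload_args("fp8", 1)),
    # Load a checkpoint that is already FP8-quantized.
    ("neuralmagic/Qwen2-1.5B-Instruct-FP8",
     offload_args(),
     offload_args(cpu_offload_gb=1)),
]

for model, baseline, offloaded in CASES:
    print(model, baseline, offloaded)
```

In both cases the commit swaps in a smaller model than before and lowers the offload size from "2" to "1"; the baseline arguments and `max_wait_seconds=480` are unchanged.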