OpenDAS / vllm_cscc
Commit c32ab8be (Unverified)
Authored Jul 30, 2024 by Cade Daniel
Committed by GitHub, Jul 31, 2024
[Speculative decoding] Add serving benchmark for llama3 70b + speculative decoding (#6964)
Parent: fb4f530b
Showing 1 changed file with 22 additions and 1 deletion
.buildkite/nightly-benchmarks/tests/serving-tests.json
@@ -55,5 +55,26 @@
             "dataset_path": "./ShareGPT_V3_unfiltered_cleaned_split.json",
             "num_prompts": 200
         }
-    }
+    },
+    {
+        "test_name": "serving_llama70B_tp4_sharegpt_specdecode",
+        "qps_list": [2],
+        "server_parameters": {
+            "model": "meta-llama/Meta-Llama-3-70B-Instruct",
+            "disable_log_requests": "",
+            "tensor_parallel_size": 4,
+            "swap_space": 16,
+            "speculative_model": "turboderp/Qwama-0.5B-Instruct",
+            "num_speculative_tokens": 4,
+            "speculative_draft_tensor_parallel_size": 1,
+            "use_v2_block_manager": ""
+        },
+        "client_parameters": {
+            "model": "meta-llama/Meta-Llama-3-70B-Instruct",
+            "backend": "vllm",
+            "dataset_name": "sharegpt",
+            "dataset_path": "./ShareGPT_V3_unfiltered_cleaned_split.json",
+            "num_prompts": 200
+        }
+    }
 ]
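For readers unfamiliar with this config format, here is a minimal Python sketch of how the new entry's `server_parameters` could be turned into command-line flags. It is an illustration under stated assumptions, not the benchmark harness itself. The helper `json_to_args` is hypothetical, and the mapping rules (underscores become dashes, an empty-string value means a bare switch with no argument) are assumptions inferred from the shape of the JSON.

```python
def json_to_args(params: dict) -> list[str]:
    """Hypothetical helper: map a parameter dict to CLI-style arguments.

    Assumed rules (not confirmed by the commit):
      * each key becomes "--key" with underscores replaced by dashes
      * an empty-string value marks a switch that takes no argument
    """
    args = []
    for key, value in params.items():
        args.append("--" + key.replace("_", "-"))
        if value != "":
            args.append(str(value))
    return args


# The server parameters added by this commit, copied from the diff.
server_parameters = {
    "model": "meta-llama/Meta-Llama-3-70B-Instruct",
    "disable_log_requests": "",
    "tensor_parallel_size": 4,
    "swap_space": 16,
    "speculative_model": "turboderp/Qwama-0.5B-Instruct",
    "num_speculative_tokens": 4,
    "speculative_draft_tensor_parallel_size": 1,
    "use_v2_block_manager": "",
}

if __name__ == "__main__":
    # Print the flags on one line for inspection.
    print(" ".join(json_to_args(server_parameters)))
```

Under these assumptions, `"disable_log_requests": ""` and `"use_v2_block_manager": ""` come out as bare switches, while the other six keys each carry a value.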