OpenDAS / vllm_cscc / Commits / e7c1b7f3
Commit e7c1b7f3 authored Sep 06, 2024 by zhuwenwen
Merge branch 'v0.5.4-dtk24.04.1'
parents 7462218e 04c62b93
Changes 721
Showing 20 changed files with 199 additions and 19 deletions (+199 -19)
.buildkite/check-wheel-size.py  +1 -1
.buildkite/download-images.sh  +0 -18
.buildkite/lm-eval-harness/configs/DeepSeek-V2-Lite-Chat.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-70B-Instruct-FBGEMM-nonuniform.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-70B-Instruct.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-Channelwise-compressed-tensors.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FBGEMM-nonuniform.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FP8.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-nonuniform-compressed-tensors.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-QQQ.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Minitron-4B-Base.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Mixtral-8x22B-Instruct-v0.1-FP8-Dynamic.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Mixtral-8x7B-Instruct-v0.1-FP8.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Mixtral-8x7B-Instruct-v0.1.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-FP8W8.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-INT8-compressed-tensors.yaml  +11 -0
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-W8A16-compressed-tensors.yaml  +11 -0
.buildkite/check-wheel-size.py
View file @ e7c1b7f3
 import os
 import zipfile

-MAX_SIZE_MB = 200
+MAX_SIZE_MB = 250

 def print_top_10_largest_files(zip_file):
...
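The rest of the script is elided in this diff view. As a rough illustration of what a wheel size check of this kind might look like, here is a minimal sketch. Only `MAX_SIZE_MB` and its new value of 250 come from the diff; `largest_files`, `wheel_within_limit`, and `build_demo_archive` are hypothetical helpers written for this example, and the actual script may compare sizes differently.

```python
import io
import zipfile

MAX_SIZE_MB = 250  # new limit from the diff above


def largest_files(zip_file, top_n=10):
    """Return (name, size_mb) pairs for the top_n largest members of a zip archive."""
    with zipfile.ZipFile(zip_file, "r") as z:
        sizes = [(info.filename, info.file_size / (1024 * 1024))
                 for info in z.infolist()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top_n]


def wheel_within_limit(size_bytes, max_size_mb=MAX_SIZE_MB):
    """Return True when a wheel of size_bytes does not exceed the limit (assumed <=)."""
    return size_bytes / (1024 * 1024) <= max_size_mb


def build_demo_archive():
    """Create a tiny in-memory zip that stands in for a wheel (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("small.txt", b"a" * 10)
        z.writestr("big.bin", b"b" * 1000)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, size_mb in largest_files(build_demo_archive()):
        print(f"{name}: {size_mb:.6f} MB")
    print(wheel_within_limit(200 * 1024 * 1024))
```

A wheel is a zip archive, so `zipfile` can list its members without unpacking them, which is why the sketch reports the largest entries when diagnosing an oversized build.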
.buildkite/download-images.sh deleted 100644 → 0
View file @ 7462218e
#!/bin/bash

set -ex
set -o pipefail

(which wget && which curl) || (apt-get update && apt-get install -y wget curl)

# aws s3 sync s3://air-example-data-2/vllm_opensource_llava/ images/
mkdir -p images
cd images
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/stop_sign_pixel_values.pt
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/stop_sign_image_features.pt
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/cherry_blossom_pixel_values.pt
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/cherry_blossom_image_features.pt
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/stop_sign.jpg
wget https://air-example-data-2.s3.us-west-2.amazonaws.com/vllm_opensource_llava/cherry_blossom.jpg

cd -
.buildkite/lm-eval-harness/configs/DeepSeek-V2-Lite-Chat.yaml 0 → 100644
View file @ e7c1b7f3
# bash ./run-lm-eval-gsm-vllm-baseline.sh -m deepseek-ai/DeepSeek-V2-Lite-Chat -b "auto" -l 1000 -f 5 -t 2
model_name: "deepseek-ai/DeepSeek-V2-Lite-Chat"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.671
  - name: "exact_match,flexible-extract"
    value: 0.664
limit: 1000
num_fewshot: 5
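Each config in this commit pairs a model with expected gsm8k scores. One plausible use is to compare freshly measured scores against those values within a tolerance. The sketch below assumes that workflow: `RTOL`, `find_regressions`, and the shape of the `measured` dictionary are hypothetical and not taken from the diff, and the config is written as a Python dict that mirrors the parsed YAML so the example has no third-party dependency.

```python
import math

RTOL = 0.05  # hypothetical relative tolerance, not taken from the source

# Mirrors DeepSeek-V2-Lite-Chat.yaml as it would look after YAML parsing.
config = {
    "model_name": "deepseek-ai/DeepSeek-V2-Lite-Chat",
    "tasks": [
        {
            "name": "gsm8k",
            "metrics": [
                {"name": "exact_match,strict-match", "value": 0.671},
                {"name": "exact_match,flexible-extract", "value": 0.664},
            ],
        },
    ],
    "limit": 1000,
    "num_fewshot": 5,
}


def find_regressions(config, measured, rtol=RTOL):
    """Return (task, metric, expected, got) for every score outside the tolerance."""
    failures = []
    for task in config["tasks"]:
        for metric in task["metrics"]:
            got = measured[task["name"]][metric["name"]]
            if not math.isclose(got, metric["value"], rel_tol=rtol):
                failures.append((task["name"], metric["name"], metric["value"], got))
    return failures


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the comparison.
    measured = {
        "gsm8k": {
            "exact_match,strict-match": 0.66,
            "exact_match,flexible-extract": 0.50,
        }
    }
    for failure in find_regressions(config, measured):
        print(failure)
```

A relative tolerance is used here because sampled evaluations (for example `limit: 1000`) vary slightly between runs, so an exact equality check would be too strict.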
.buildkite/lm-eval-harness/configs/Meta-Llama-3-70B-Instruct-FBGEMM-nonuniform.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-hf-baseline.sh -m nm-testing/Meta-Llama-3-70B-Instruct-FBGEMM-nonuniform -b auto -l 1000 -f 5
model_name: "nm-testing/Meta-Llama-3-70B-Instruct-FBGEMM-nonuniform"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.905
  - name: "exact_match,flexible-extract"
    value: 0.905
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-70B-Instruct.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-hf-baseline.sh -m meta-llama/Meta-Llama-3-70B-Instruct -b 32 -l 250 -f 5
model_name: "meta-llama/Meta-Llama-3-70B-Instruct"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.892
  - name: "exact_match,flexible-extract"
    value: 0.892
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-Channelwise-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Meta-Llama-3-8B-Instruct-W8A8-FP8-Channelwise-compressed-tensors -b auto -l 1000 -f 5 -t 1
model_name: "nm-testing/Meta-Llama-3-8B-Instruct-W8A8-FP8-Channelwise-compressed-tensors"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.752
  - name: "exact_match,flexible-extract"
    value: 0.754
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FBGEMM-nonuniform.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Meta-Llama-3-8B-Instruct-FBGEMM-nonuniform -b auto -l 1000 -f 5 -t 1
model_name: "nm-testing/Meta-Llama-3-8B-Instruct-FBGEMM-nonuniform"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.753
  - name: "exact_match,flexible-extract"
    value: 0.753
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Meta-Llama-3-8B-FP8-compressed-tensors-test -b 32 -l 1000 -f 5 -t 1
model_name: "nm-testing/Meta-Llama-3-8B-FP8-compressed-tensors-test"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.755
  - name: "exact_match,flexible-extract"
    value: 0.755
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-FP8.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m neuralmagic/Meta-Llama-3-8B-Instruct-FP8 -b 32 -l 250 -f 5 -t 1
model_name: "neuralmagic/Meta-Llama-3-8B-Instruct-FP8"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.753
  - name: "exact_match,flexible-extract"
    value: 0.753
limit: 1000
num_fewshot: 5
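In this file the header comment passes `-l 250` while the config body sets `limit: 1000`. A small consistency check could flag such drift between the recorded command and the config. The sketch below assumes that `-l` corresponds to `limit` and `-f` to `num_fewshot`, which is inferred from the other configs in this commit rather than stated anywhere in the diff; `flags_from_comment` and `mismatches` are hypothetical helper names.

```python
import shlex


def flags_from_comment(comment):
    """Extract short flags (such as -m, -b, -l, -f, -t) and their values from a '# bash ...' line."""
    tokens = shlex.split(comment.lstrip("# "))
    flags = {}
    for i, tok in enumerate(tokens[:-1]):
        # A short flag is a dash plus one character; its value is the next token.
        if tok.startswith("-") and len(tok) == 2:
            flags[tok] = tokens[i + 1]
    return flags


def mismatches(comment, limit, num_fewshot):
    """Return (field, value_in_comment, value_in_config) for each disagreement."""
    flags = flags_from_comment(comment)
    found = []
    # Assumed mapping: -l -> limit, -f -> num_fewshot.
    if "-l" in flags and int(flags["-l"]) != limit:
        found.append(("limit", int(flags["-l"]), limit))
    if "-f" in flags and int(flags["-f"]) != num_fewshot:
        found.append(("num_fewshot", int(flags["-f"]), num_fewshot))
    return found


if __name__ == "__main__":
    comment = ("# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh "
               "-m neuralmagic/Meta-Llama-3-8B-Instruct-FP8 -b 32 -l 250 -f 5 -t 1")
    print(mismatches(comment, limit=1000, num_fewshot=5))
```

`shlex.split` is used so that quoted values such as `-b "auto"` in the other configs are handled the same way a shell would handle them.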
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Meta-Llama-3-8B-Instruct-W8-Channel-A8-Dynamic-Per-Token-Test -b "auto" -l 250 -f 5 -t 1
model_name: "nm-testing/Meta-Llama-3-8B-Instruct-W8-Channel-A8-Dynamic-Per-Token-Test"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.728
  - name: "exact_match,flexible-extract"
    value: 0.728
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct-nonuniform-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Meta-Llama-3-8B-Instruct-nonuniform-test -b auto -l 1000 -f 5 -t 1
model_name: "nm-testing/Meta-Llama-3-8B-Instruct-nonuniform-test"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.758
  - name: "exact_match,flexible-extract"
    value: 0.759
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-Instruct.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-hf-baseline.sh -m meta-llama/Meta-Llama-3-8B-Instruct -b 32 -l 250 -f 5 -t 1
model_name: "meta-llama/Meta-Llama-3-8B-Instruct"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.756
  - name: "exact_match,flexible-extract"
    value: 0.752
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Meta-Llama-3-8B-QQQ.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m HandH1998/QQQ-Llama-3-8b-g128 -b 32 -l 1000 -f 5 -t 1
model_name: "HandH1998/QQQ-Llama-3-8b-g128"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.409
  - name: "exact_match,flexible-extract"
    value: 0.406
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Minitron-4B-Base.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nvidia/Minitron-4B-Base -b auto -l 1000 -f 5 -t 1
model_name: "nvidia/Minitron-4B-Base"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.252
  - name: "exact_match,flexible-extract"
    value: 0.252
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Mixtral-8x22B-Instruct-v0.1-FP8-Dynamic.yaml 0 → 100644
View file @ e7c1b7f3
# bash ./run-lm-eval-gsm-vllm-baseline.sh -m neuralmagic/Mixtral-8x22B-Instruct-v0.1-FP8-dynamic -b "auto" -l 250 -f 5 -t 8
model_name: "neuralmagic/Mixtral-8x22B-Instruct-v0.1-FP8-dynamic"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.86
  - name: "exact_match,flexible-extract"
    value: 0.86
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Mixtral-8x7B-Instruct-v0.1-FP8.yaml 0 → 100644
View file @ e7c1b7f3
# bash ./run-lm-eval-gsm-vllm-baseline.sh -m neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8 -b "auto" -l 250 -f 5 -t 4
model_name: "neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.624
  - name: "exact_match,flexible-extract"
    value: 0.624
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Mixtral-8x7B-Instruct-v0.1.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-hf-baseline.sh -m neuralmagic/Mixtral-8x7B-Instruct-v0.1 -b 32 -l 250 -f 5 -t 4
model_name: "mistralai/Mixtral-8x7B-Instruct-v0.1"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.616
  - name: "exact_match,flexible-extract"
    value: 0.632
limit: 250
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-FP8W8.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Qwen2-1.5B-Instruct-FP8W8 -b auto -l 1000 -f 5 -t 1
model_name: "nm-testing/Qwen2-1.5B-Instruct-FP8W8"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.578
  - name: "exact_match,flexible-extract"
    value: 0.585
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-INT8-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m neuralmagic/Qwen2-1.5B-Instruct-quantized.w8a8 -b "auto" -l 1000 -f 5 -t 1
model_name: "neuralmagic/Qwen2-1.5B-Instruct-quantized.w8a8"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.593
  - name: "exact_match,flexible-extract"
    value: 0.588
limit: 1000
num_fewshot: 5
.buildkite/lm-eval-harness/configs/Qwen2-1.5B-Instruct-W8A16-compressed-tensors.yaml 0 → 100644
View file @ e7c1b7f3
# bash .buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh -m nm-testing/Qwen2-1.5B-Instruct-W8A16-Channelwise -b "auto" -l 1000 -f 5 -t 1
model_name: "nm-testing/Qwen2-1.5B-Instruct-W8A16-Channelwise"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.595
  - name: "exact_match,flexible-extract"
    value: 0.582
limit: 1000
num_fewshot: 5