OpenDAS / vllm_cscc
Commit 40c27a7c (Unverified)
Authored Jul 30, 2024 by Simon Mo; committed by GitHub, Jul 30, 2024
[Build] Temporarily Disable Kernels and LoRA tests (#6961)
Parent: 6ca8031e
Showing 1 changed file with 20 additions and 20 deletions
.buildkite/test-pipeline.yaml (+20 -20), file @ 40c27a7c
@@ -155,12 +155,12 @@ steps:
     - pytest -v -s test_inputs.py
     - pytest -v -s multimodal
-- label: Kernels Test %N
-  #mirror_hardwares: [amd]
-  commands:
-    - pip install https://github.com/flashinfer-ai/flashinfer/releases/download/v0.0.8/flashinfer-0.0.8+cu121torch2.3-cp310-cp310-linux_x86_64.whl
-    - pytest -v -s kernels --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT
-  parallelism: 4
+# - label: Kernels Test %N
+#   #mirror_hardwares: [amd]
+#   commands:
+#     - pip install https://github.com/flashinfer-ai/flashinfer/releases/download/v0.0.8/flashinfer-0.0.8+cu121torch2.3-cp310-cp310-linux_x86_64.whl
+#     - pytest -v -s kernels --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT
+#   parallelism: 4
 - label: Models Test
   #mirror_hardwares: [amd]
@@ -202,20 +202,20 @@ steps:
     - export VLLM_ATTENTION_BACKEND=XFORMERS
     - pytest -v -s spec_decode
-- label: LoRA Test %N
-  #mirror_hardwares: [amd]
-  command: pytest -v -s lora --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT --ignore=lora/test_long_context.py
-  parallelism: 4
-- label: LoRA Long Context (Distributed)
-  #mirror_hardwares: [amd]
-  num_gpus: 4
-  # This test runs llama 13B, so it is required to run on 4 GPUs.
-  commands:
-    # FIXIT: find out which code initialize cuda before running the test
-    # before the fix, we need to use spawn to test it
-    - export VLLM_WORKER_MULTIPROC_METHOD=spawn
-    - pytest -v -s -x lora/test_long_context.py
+# - label: LoRA Test %N
+#   #mirror_hardwares: [amd]
+#   command: pytest -v -s lora --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT --ignore=lora/test_long_context.py
+#   parallelism: 4
+# - label: LoRA Long Context (Distributed)
+#   #mirror_hardwares: [amd]
+#   num_gpus: 4
+#   # This test runs llama 13B, so it is required to run on 4 GPUs.
+#   commands:
+#     # FIXIT: find out which code initialize cuda before running the test
+#     # before the fix, we need to use spawn to test it
+#     - export VLLM_WORKER_MULTIPROC_METHOD=spawn
+#     - pytest -v -s -x lora/test_long_context.py
 - label: Tensorizer Test
   #mirror_hardwares: [amd]
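Note on the sharded steps disabled above: with `parallelism: 4`, Buildkite starts four copies of the step and sets `BUILDKITE_PARALLEL_JOB` (the zero-based job index) and `BUILDKITE_PARALLEL_JOB_COUNT` (the total) in each one. The `$$` defers expansion to job runtime. The `--shard-id` and `--num-shards` options presumably come from a pytest sharding plugin. The sketch below only illustrates the general idea of deterministic sharding by hashing test IDs. It is an assumption for illustration, not necessarily how the plugin used in this pipeline assigns tests, and the names `shard_of`, `select_shard`, and the test IDs are hypothetical.

```python
import hashlib


def shard_of(test_id: str, num_shards: int) -> int:
    """Map a test ID to a shard index in [0, num_shards) with a stable hash."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def select_shard(test_ids, shard_id: int, num_shards: int):
    """Return the subset of tests that the job with index shard_id would run."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    return [t for t in test_ids if shard_of(t, num_shards) == shard_id]


if __name__ == "__main__":
    # Hypothetical test IDs, for illustration only.
    tests = [f"kernels/test_example.py::test_case_{i}" for i in range(10)]
    num_shards = 4  # mirrors `parallelism: 4` in the pipeline
    for shard_id in range(num_shards):
        print(shard_id, select_shard(tests, shard_id, num_shards))
```

Because the assignment depends only on the test ID and the shard count, every job computes the same partition independently, and each test runs in exactly one job. Shard sizes are only roughly balanced under this scheme.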