Commit 3fd1d4ec authored by Charlie Fu, committed by GitHub

[Rocm][CI] Fix LM Eval Large Models (H100) test group (#34750)


Signed-off-by: charlifu <charlifu@amd.com>
parent cb21972a
Meta-Llama-4-Maverick-17B-128E-Instruct-FP8.yaml Meta-Llama-4-Maverick-17B-128E-Instruct-FP8.yaml
Qwen3-235B-A22B-Instruct-2507-FP8.yaml
@@ -1544,8 +1544,8 @@ steps:
 - export VLLM_WORKER_MULTIPROC_METHOD=spawn
 - pytest -s -v test_lm_eval_correctness.py --config-list-file=configs/models-large.txt --tp-size=4
-##### H100 test #####
-- label: LM Eval Large Models (H100) # optional
+##### FP8 test #####
+- label: LM Eval Large Models (H100) # optional, still use H100 for consistency
 gpu: h100
 optional: true
 mirror_hardwares: [amdexperimental, amdproduction]
@@ -1557,8 +1557,8 @@ steps:
 - csrc/
 - vllm/model_executor/layers/quantization
 commands:
-- export VLLM_USE_DEEP_GEMM=0 # We found Triton is faster than DeepGEMM for H100
-- pytest -s -v test_lm_eval_correctness.py --config-list-file=configs/models-large-hopper.txt --tp-size=4
+- export VLLM_USE_DEEP_GEMM=0
+- pytest -s -v test_lm_eval_correctness.py --config-list-file=configs/models-large-rocm.txt --tp-size=4
 ##### H200 test #####
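
Since the test step now points at a different list file, a quick pre-flight check can catch list entries that have no matching config on disk. The following is a minimal sketch and not part of this commit: the helper name `missing_configs` is hypothetical, and it assumes the list file holds one config file name per line, resolved relative to the list file's own directory. The real test harness may resolve paths differently.

```python
from pathlib import Path


def missing_configs(list_file: str) -> list[str]:
    """Return entries of a config list file that do not exist on disk.

    Assumption (not confirmed by the commit): one config file name per
    line, resolved relative to the directory containing the list file.
    Blank lines and lines starting with '#' are skipped.
    """
    list_path = Path(list_file)
    missing = []
    for raw in list_path.read_text().splitlines():
        name = raw.strip()
        if not name or name.startswith("#"):
            continue
        if not (list_path.parent / name).is_file():
            missing.append(name)
    return missing
```

Run from the directory that holds `test_lm_eval_correctness.py`, a call such as `missing_configs("configs/models-large-rocm.txt")` would return an empty list when every listed config is present.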