# Test File Structure and Style Guide

To ensure project maintainability and sustainable development, we encourage contributors to submit test code (unit tests, system tests, or end-to-end tests) alongside their code changes. This document outlines the guidelines for organizing and naming test files.

## Test Types

### Unit Tests and System Tests
For unit tests and system tests, we strongly recommend placing test files in the same directory structure as the source code being tested, using the naming convention `test_*.py`.
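As a minimal illustration of this convention, a unit test might look like the sketch below (the module and function names are hypothetical, used only to show the mocked style):

```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Unit test sketch for a hypothetical vllm_omni/utils/platform_utils.py."""
from unittest import mock


# Stand-in for the function under test; a real test would instead do
# `from vllm_omni.utils.platform_utils import current_device_name`.
def current_device_name(platform) -> str:
    return platform.device_name()


def test_current_device_name() -> None:
    """current_device_name returns the name reported by the platform."""
    fake_platform = mock.Mock()
    fake_platform.device_name.return_value = "cuda:0"
    assert current_device_name(fake_platform) == "cuda:0"
```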

### End-to-End (E2E) Tests for Models
End-to-end tests verify the complete functionality of a system or component. For our project, the E2E tests for different omni models are organized into two subdirectories:

- **`tests/e2e/offline_inference/`**: Tests for offline inference modes (e.g., Qwen3Omni offline inference)

- **`tests/e2e/online_serving/`**: Tests for online serving scenarios (e.g., API server tests)

**Example:** Following the unit-test convention above, the test file for `vllm_omni/entrypoints/omni_llm.py` should be located at `tests/entrypoints/test_omni_llm.py`.

## Test Directory Structure

The ideal directory structure mirrors the source code organization:

```
vllm_omni/                          tests/
├── config/                    →    ├── config/
│   └── model.py                    │   └── test_model.py

├── core/                      →    ├── core/
│   └── sched/                      │   └── sched/                    # Maps to core/sched/
│       ├── omni_ar_scheduler.py    │       ├── test_omni_ar_scheduler.py
│       ├── omni_generation_scheduler.py │  ├── test_omni_generation_scheduler.py
│       └── output.py               │       └── test_output.py

├── diffusion/                 →    ├── diffusion/
│   ├── diffusion_engine.py         │   ├── test_diffusion_engine.py
│   ├── omni_diffusion.py           │   ├── test_omni_diffusion.py
│   ├── attention/                  │   ├── attention/                # Maps to diffusion/attention/
│   │   └── backends/               │   │   └── test_*.py
│   ├── models/                     │   ├── models/                   # Maps to diffusion/models/
│   │   ├── qwen_image/             │   │   ├── qwen_image/
│   │   │   └── ...                 │   │   │   └── test_*.py
│   │   └── z_image/                │   │   └── z_image/
│   │       └── ...                 │   │       └── test_*.py
│   └── worker/                     │   └── worker/                   # Maps to diffusion/worker/
│       └── ...                     │       └── test_*.py

├── distributed/               →    ├── distributed/
│   └── ...                         │   └── test_*.py

├── engine/                    →    ├── engine/
│   ├── processor.py                │   ├── test_processor.py
│   └── output_processor.py         │   └── test_output_processor.py

├── entrypoints/               →    ├── entrypoints/
│   ├── omni_llm.py                 │   ├── test_omni_llm.py          # UT: OmniLLM core logic (mocked)
│   ├── omni_stage.py               │   ├── test_omni_stage.py         # UT: OmniStage logic
│   ├── omni.py                     │   ├── test_omni.py               # E2E: Omni class (offline inference)
│   ├── async_omni.py               │   ├── test_async_omni.py         # E2E: AsyncOmni class
│   ├── cli/                        │   ├── cli/                       # Maps to entrypoints/cli/
│   │   └── ...                     │   │   └── test_*.py
│   └── openai/                     │   └── openai/                     # Maps to entrypoints/openai/
│       ├── api_server.py           │       ├── test_api_server.py     # E2E: API server (online serving)
│       └── serving_chat.py         │       └── test_serving_chat.py

├── inputs/                    →    ├── inputs/
│   ├── data.py                     │   ├── test_data.py
│   ├── parse.py                    │   ├── test_parse.py
│   └── preprocess.py               │   └── test_preprocess.py

├── model_executor/            →    ├── model_executor/
│   ├── layers/                     │   ├── layers/
│   │   └── mrope.py                │   │   └── test_mrope.py
│   ├── model_loader/               │   ├── model_loader/
│   │   └── weight_utils.py         │   │   └── test_weight_utils.py
│   ├── models/                     │   ├── models/
│   │   ├── qwen2_5_omni/           │   │   ├── qwen2_5_omni/
│   │   │   ├── qwen2_5_omni_thinker.py │ │   │   ├── test_qwen2_5_omni_thinker.py  # UT
│   │   │   ├── qwen2_5_omni_talker.py │ │   │   ├── test_qwen2_5_omni_talker.py  # UT
│   │   │   └── qwen2_5_omni_token2wav.py │ │   │   └── test_qwen2_5_omni_token2wav.py  # UT
│   │   └── qwen3_omni/             │   │   └── qwen3_omni/
│   │       └── ...                 │   │       └── test_*.py
│   ├── stage_configs/              │   ├── stage_configs/             # Configuration tests (if needed)
│   │   └── ...                     │   │   └── test_*.py
│   └── stage_input_processors/     │   └── stage_input_processors/
│       └── ...                     │       └── test_*.py

├── sample/                    →    ├── sample/
│   └── ...                         │   └── test_*.py

├── utils/                     →    ├── utils/
│   └── platform_utils.py           │   └── test_platform_utils.py

├── worker/                    →    ├── worker/
│   ├── gpu_ar_worker.py            │   ├── test_gpu_ar_worker.py
│   ├── gpu_generation_worker.py    │   ├── test_gpu_generation_worker.py
│   ├── gpu_model_runner.py         │   ├── test_gpu_model_runner.py
│   └── npu/                        │   └── npu/                       # Maps to worker/npu/
│       └── ...                     │       └── test_*.py

└── e2e/                       →    ├── e2e/                # End-to-end scenarios (no 1:1 source mirror)
                                    ├── online_serving/       # Full-stack online serving flows
                                    │   └── (empty for now)
                                    └── offline_inference/    # Full offline inference flows
                                        ├── test_qwen2_5_omni.py     # Moved from multi_stages/
                                        ├── test_qwen3_omni.py       # Moved from multi_stages_h100/
                                        ├── test_t2i_model.py  # Moved from single_stage/
                                        └── stage_configs/           # Shared stage configs
                                            ├── qwen2_5_omni_ci.yaml
                                            └── qwen3_omni_ci.yaml
```
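
The mirroring rule can be derived mechanically from a source path. A shell sketch (the path shown is the example from this guide):

```shell
# Derive the mirrored test path for a source file (sketch)
src="vllm_omni/entrypoints/omni_llm.py"
test_path="tests/${src#vllm_omni/}"
test_path="$(dirname "$test_path")/test_$(basename "$test_path")"
echo "$test_path"   # → tests/entrypoints/test_omni_llm.py
```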



### Naming Conventions

- **Unit Tests**: Use `test_<module_name>.py` format. Example: `omni_llm.py` → `test_omni_llm.py`

- **E2E Tests**: Place in `tests/e2e/offline_inference/` or `tests/e2e/online_serving/` with descriptive names. Example: `tests/e2e/offline_inference/test_qwen3_omni.py`, `tests/e2e/offline_inference/test_diffusion_model.py`

### Best Practices

1. **Mirror Source Structure**: Test directories should mirror the source code structure
2. **Test Type Indicators**: Use comments to indicate test types (UT for unit tests, E2E for end-to-end tests)
3. **Shared Resources**: Place shared test configurations (e.g., CI configs) in appropriate subdirectories
4. **Consistent Naming**: Follow the `test_*.py` naming convention consistently across all test files


## Test code requirements

### Coding style

1. **File header**: Add SPDX license header to all test files
2. **Imports**: Do not use manual `sys.path` modifications; use standard imports instead.
3. **Test type differentiation**:

      - Unit tests: keep external dependencies mocked
      - E2E tests for models: prefer using `OmniRunner` uniformly; avoid ad-hoc decorators

4. **Documentation**: Add docstrings to all test functions
5. **Environment variables**: Set uniformly in `conftest.py` or at the top of files
6. **Type annotations**: Add type annotations to all test function parameters
7. **Pytest Markers**: Add necessary markers like `@pytest.mark.core_model` and use `@hardware_test` to declare hardware requirements (see details in [Markers for Tests](../ci/tests_markers.md)).
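
For rule 5, a minimal `conftest.py` sketch (the variable shown is the one used by the templates below; any other values are illustrative):

```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""conftest.py sketch: set environment variables once for the whole test directory."""
import os

# Set before any vllm_omni import so that spawned workers inherit the value.
os.environ.setdefault("VLLM_WORKER_MULTIPROC_METHOD", "spawn")
```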

### Template
#### E2E - Online serving

```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
Online E2E smoke test for an omni model (video, text, audio → audio).
"""
import os
from pathlib import Path

import openai
import pytest

from tests.utils import hardware_test

# Optional: set the process start method for workers
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

models = ["{your model name}"]  # Edit here to load your model
stage_configs = [str(Path(__file__).parent / "stage_configs" / "{your model yaml}")]  # Edit here to load your model yaml
test_params = [(model, stage_config) for model in models for stage_config in stage_configs]


# OmniServer: used to start the vllm-omni server (implementation elided)
class OmniServer:
    ...


@pytest.fixture
def omni_server(request):
    model, stage_config_path = request.param
    with OmniServer(model, ["--stage-configs-path", stage_config_path]) as server:
        yield server


# Helpers for building the request message (implementations elided)
@pytest.fixture(scope="session")
def base64_encoded_video() -> str:
    ...


def dummy_messages_from_video_data(video_data_url: str) -> list:
    ...


@pytest.mark.core_model
@pytest.mark.omni
@hardware_test(
    res={"cuda": "L4", "rocm": "MI325", "npu": "A2"},
    num_cards={"cuda": 2, "rocm": 2, "npu": 4},
)
@pytest.mark.parametrize("omni_server", test_params, indirect=True)
def test_video_to_audio(
    client: openai.OpenAI,  # Fixture assumed to point at the omni_server endpoint
    omni_server,
    base64_encoded_video: str,
) -> None:
    """Online serving: video input, text and audio output."""
    # Build the request message
    video_data_url = f"data:video/mp4;base64,{base64_encoded_video}"
    messages = dummy_messages_from_video_data(video_data_url)

    # Send the request
    chat_completion = client.chat.completions.create(
        model=omni_server.model,
        messages=messages,
    )

    # Verify the text output
    text_choice = chat_completion.choices[0]
    assert text_choice.finish_reason == "length"

    # Verify the audio output
    audio_choice = chat_completion.choices[1]
    audio_message = audio_choice.message
    if hasattr(audio_message, "audio") and audio_message.audio:
        assert audio_message.audio.data is not None
        assert len(audio_message.audio.data) > 0
```

#### E2E - Offline inference
```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
Offline E2E smoke test for an omni model (video → audio).
"""

import os
from pathlib import Path

import pytest
from vllm.assets.video import VideoAsset

from tests.utils import hardware_test
from ..multi_stages.conftest import OmniRunner

# Optional: set process start method for workers
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

models = ["{your model name}"]  # Edit here to load your model
stage_configs = [str(Path(__file__).parent / "stage_configs" / "{your model yaml}")]  # Edit here to load your model yaml

# Create parameter combinations for model and stage config
test_params = [(model, stage_config) for model in models for stage_config in stage_configs]

# function name: test_{input_modality}_to_{output_modality}
# modality candidate: text, image, audio, video, mixed_modalities
@pytest.mark.core_model
@pytest.mark.omni
@hardware_test(
    res={"cuda": "L4", "rocm": "MI325", "npu": "A2"},
    num_cards=2,
)
@pytest.mark.parametrize("test_config", test_params)
def test_video_to_audio(omni_runner: type[OmniRunner], test_config: tuple[str, str]) -> None:
    """Offline inference: video input, audio output."""
    model, stage_config_path = test_config
    with omni_runner(model, seed=42, stage_configs_path=stage_config_path) as runner:
        # Prepare inputs
        video = VideoAsset(name="sample", num_frames=4).np_ndarrays

        outputs = runner.generate_multimodal(
            prompts="Describe this video briefly.",
            videos=video,
        )

        # Minimal assertions: got outputs and at least one audio result
        assert outputs
        has_audio = any(o.final_output_type == "audio" for o in outputs)
        assert has_audio
```

## Checklist before submitting your test files

1. The file is saved in an appropriate place and the file name is clear.
2. The coding style follows the requirements outlined above.
3. All test functions have appropriate pytest markers.
4. For tests that need to run in CI, ensure the test is configured under the `./buildkite/` folder.