README.md

# README for Evaluation

## 🌟 Overview

This script provides an evaluation pipeline for `MVBench`.

## 🗂️ Data Preparation

Before starting to download the data, please create the `InternVL/internvl_chat/data` folder.

### MVBench

Follow the instructions below to prepare the data:

```shell
# Step 1: Download the dataset
cd data/
huggingface-cli download --repo-type dataset --resume-download OpenGVLab/MVBench --local-dir MVBench --local-dir-use-symlinks False

# Step 2: Unzip videos
cd MVBench/video/
for file in *.zip; do unzip "$file" -d "${file%.*}"; done
cd ../../..
```

After preparation is complete, the directory structure is:

```shell
data/MVBench
├── json
│   ├── action_antonym.json
│   ├── action_count.json
│   ├── action_localization.json
│   ├── action_prediction.json
│   ├── action_sequence.json
│   ├── character_order.json
│   ├── counterfactual_inference.json
│   ├── egocentric_navigation.json
│   ├── episodic_reasoning.json
│   ├── fine_grained_action.json
│   ├── fine_grained_pose.json
│   ├── moving_attribute.json
│   ├── moving_count.json
│   ├── moving_direction.json
│   ├── object_existence.json
│   ├── object_interaction.json
│   ├── object_shuffle.json
│   ├── scene_transition.json
│   ├── state_change.json
│   └── unexpected_action.json
├── README.md
└── video
    ├── clevrer
    ├── FunQA_test
    ├── Moments_in_Time_Raw
    ├── nturgbd
    ├── perception
    ├── scene_qa
    ├── ssv2_video
    ├── sta
    ├── star
    ├── tvqa
    └── vlnqa
```

## 🏃 Evaluation Execution

To run the evaluation, execute the following command on an 8-GPU setup:

```shell
torchrun --nproc_per_node=8 eval/mvbench/evaluate_mvbench.py --checkpoint ${CHECKPOINT} --num_segments 16
```

Alternatively, you can run the following simplified command:

```shell
GPUS=8 sh evaluate.sh ${CHECKPOINT} mvbench
```

### Arguments

The following arguments can be configured for the evaluation script:

| Argument         | Type   | Default     | Description                                                                                                       |
| ---------------- | ------ | ----------- | ----------------------------------------------------------------------------------------------------------------- |
| `--checkpoint`   | `str`  | `''`        | Path to the model checkpoint.                                                                                     |
| `--datasets`     | `str`  | `'mvbench'` | Comma-separated list of datasets to evaluate.                                                                     |
| `--dynamic`      | `flag` | `False`     | Enables dynamic high resolution preprocessing.                                                                    |
| `--max-num`      | `int`  | `1`         | Maximum tile number for dynamic high resolution.                                                                  |
| `--load-in-8bit` | `flag` | `False`     | Loads the model weights in 8-bit precision.                                                                       |
| `--auto`         | `flag` | `False`     | Automatically splits a large model across 8 GPUs when needed, useful for models too large to fit on a single GPU. |