## Generate Training Data

### Step 1: Create the environment
```
conda create -n make_data python=3.10
conda activate make_data
bash build.sh
```

### Step 2: Prepare CSV File
Create a CSV file listing the paths to ground-truth (GT) videos and their corresponding text descriptions. Use the following format:
```
path,text
/xxx/xxx/dog.mp4, A dog is sitting on the couch.
...
```
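If you build the CSV programmatically, a minimal sketch like the one below may help. It assumes you already have (video path, caption) pairs in memory and that the pipeline reads the file with a standard CSV parser, which this README does not confirm. The function name `write_caption_csv` is hypothetical and not part of this repository.

```python
import csv


def write_caption_csv(pairs, csv_path):
    """Write (video_path, caption) pairs to a CSV with a `path,text` header.

    Hypothetical helper, not part of this repository.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)  # quotes captions that contain commas
        writer.writerow(["path", "text"])
        for video_path, caption in pairs:
            writer.writerow([str(video_path), caption.strip()])
```

For example, `write_caption_csv([("/path/to/dog.mp4", "A dog is sitting on the couch.")], "videos.csv")` writes a header row followed by one data row. The path and output filename here are placeholders.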

### Step 3: Configure Paths
Open `make_paired_data.sh` and modify the following variables:

- `INPUT_CSV`: Path to your CSV file
- `SAVE_PATH`: Directory to save the generated paired data
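Before running the script, it can be useful to check that every video listed in the file you set as `INPUT_CSV` actually exists. The following is a rough sketch under the assumption that the CSV has the `path,text` header shown in Step 2. `find_missing_videos` is a hypothetical helper, not something shipped with this repository.

```python
import csv
from pathlib import Path


def find_missing_videos(input_csv):
    """Return entries from the `path` column that are not files on disk.

    Hypothetical helper, not part of this repository.
    """
    missing = []
    with open(input_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            path = (row.get("path") or "").strip()
            if path and not Path(path).is_file():
                missing.append(path)
    return missing
```

An empty list means every listed path was found.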

### Step 4: Run the Script
```
bash make_paired_data.sh
```

⚠️ **Notice:** The current version of `make_paired_data.sh` only supports `batch_size=1`.  
To process data in parallel, split the CSV file into multiple parts, point `INPUT_CSV` at one part per run, and run the script separately on each part.
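
One way to do the split is sketched below. It distributes rows round-robin and repeats the header in every part, assuming the `path,text` layout from Step 2. The function name `split_csv` and the `_partN` file naming are illustrative choices, not conventions of this repository.

```python
import csv
from pathlib import Path


def split_csv(input_csv, num_parts, out_dir):
    """Split a CSV into up to `num_parts` files, each keeping the header row.

    Hypothetical helper, not part of this repository.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")

    with open(input_csv, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [row for row in reader if row]  # skip blank lines

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_csv).stem

    part_paths = []
    for i in range(num_parts):
        part_rows = rows[i::num_parts]  # round-robin keeps parts balanced
        if not part_rows:
            continue  # fewer rows than parts: skip empty files
        part_path = out_dir / f"{stem}_part{i}.csv"
        with open(part_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part_rows)
        part_paths.append(part_path)
    return part_paths
```

Each returned path can then be used as `INPUT_CSV` for one run of `bash make_paired_data.sh`. This README does not say whether concurrent runs can share a `SAVE_PATH`, so giving each run its own output directory may be the safer choice.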