Add notes on file structure in VoxCeleb1-based datasets (#2776)
Summary:
The file structure of VoxCeleb1 is as follows:
```
root/
└── wav/
    └── speaker_id folders
```
Users who prepare the VoxCeleb1 dataset with [Kaldi](https://github.com/kaldi-asr/kaldi/blob/f6f4ccaf213f0fe8b26e633a7dc0c802150626a0/egs/voxceleb/v1/local/make_voxceleb1_v2.pl) have "dev" and "test" folders above the "wav" folder. However, file lists such as https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test.txt and https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/iden_split.txt make no such distinction, so it is not necessary to put the extracted files into separate folders.
This PR adds notes to the `VoxCeleb1Identification` and `VoxCeleb1Verification` datasets to inform users of the expected file structure.
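For users starting from a Kaldi-style layout, the sketch below shows one possible way to flatten it into the single `wav` folder described above. It is not part of this PR: `merge_kaldi_splits` is a hypothetical helper, and it assumes the extracted data sits at `root/dev/wav/<speaker_id>` and `root/test/wav/<speaker_id>` with no speaker folder appearing in both splits.

```python
import shutil
from pathlib import Path


def merge_kaldi_splits(root, splits=("dev", "test")):
    """Move root/<split>/wav/<speaker_id> into root/wav/<speaker_id>.

    Hypothetical helper; the split folder names are an assumption
    about how the archives were extracted.
    """
    root = Path(root)
    target = root / "wav"
    target.mkdir(exist_ok=True)

    moved = []
    for split in splits:
        split_wav = root / split / "wav"
        if not split_wav.is_dir():
            continue  # nothing extracted for this split
        for speaker_dir in sorted(split_wav.iterdir()):
            if not speaker_dir.is_dir():
                continue
            dest = target / speaker_dir.name
            if dest.exists():
                # Refuse to overwrite rather than silently mixing folders.
                raise FileExistsError(f"{dest} already exists")
            shutil.move(str(speaker_dir), str(dest))
            moved.append(speaker_dir.name)
    return moved
```

After calling `merge_kaldi_splits("/path/to/root")`, the speaker folders sit directly under `root/wav/`, which matches the structure shown in the summary. The emptied `dev` and `test` folders are left in place.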
Pull Request resolved: https://github.com/pytorch/audio/pull/2776
Reviewed By: carolineechen
Differential Revision: D40483707
Pulled By: nateanl
fbshipit-source-id: ccd1780a72a5b53f0300c2466c3073a293ad7b8d