# egs2 (Examples of ESPnet2)

## How to use?

See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2

## Overview of example information

| Directory name          | Corpus name                                                                                                                      | Task                    | Language              | URL                                                                                                          | Note         |
|-------------------------|----------------------------------------------------------------------------------------------------------------------------------|-------------------------| --------------------- | ------------------------------------------------------------------------------------------------------------ | ------------ |
| accented_french_openslr57 | African Accented French Corpus                                                                                                 | ASR                     | FRA                   | https://www.openslr.org/57/                                                                                  |              |
| aesrc2020               | Accented English Speech Recognition Challenge 2020                                                                               | ASR                     | ENG                   | https://arxiv.org/abs/2102.10233                                                                             |              |
| aidatatang_200zh        | Aidatatang_200zh A free Chinese Mandarin speech corpus                                                                           | ASR                     | CMN                   | http://www.openslr.org/resources/62                                                                          |              |
| aishell                 | AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus                                                                           | ASR                     | CMN                   | http://www.aishelltech.com/kysjcp                                                                            |              |
| aishell2                | AISHELL-2 Open Source Mandarin Speech Corpus                                                                                     | ASR                     | CMN                   | https://www.aishelltech.com/aishell_2                                                                        |              |
| aishell3                | AISHELL3 Mandarin multi-speaker text-to-speech                                                                                   | TTS                     | CMN                   | https://www.openslr.org/93/                                                                                  |              |
| aishell4                | AISHELL4 Open Source Mandarin Speech Corpus in Conference Scenario                                                               | ASR/SE                  | CMN                   | https://www.openslr.org/111/                                                                                 |              |
| americasnlp22           | The Second AmericasNLP Competition                                                                                               | ASR                     | BZD, GUG, GVC, QWE, TAV | http://turing.iimas.unam.mx/americasnlp/st.html                                                            |              |
| ami                     | The AMI Meeting Corpus                                                                                                           | ASR                     | ENG                   | http://groups.inf.ed.ac.uk/ami/corpus/                                                                       |              |
| an4                     | CMU AN4 database                                                                                                                 | ASR/TTS                 | ENG                   | http://www.speech.cs.cmu.edu/databases/an4/                                                                  |              |
| aphasiabank             | AphasiaBank database (English)                                                                                                   | ASR                     | ENG                   | https://aphasia.talkbank.org/                                                                                |              |
| babel                   | IARPA Babel corpus                                                                                                               | ASR                     | ~20 languages         | https://www.iarpa.gov/index.php/research-programs/babel                                                      |              |
| bn_openslr53            | Large Bengali ASR training dataset                                                                                               | ASR                     | BEN                   | https://openslr.org/53/                                                                                      |              |
| bur_openslr80           | Burmese ASR training dataset                                                                                                     | ASR                     | BUR                   | https://openslr.org/80/                                                                                      |              |
| catslu                  | CATSLU-MAPS                                                                                                                      | SLU                     | CMN                   | https://sites.google.com/view/catslu/home                                                                    |              |
| catslu_entity           | CATSLU                                                                                                                           | SLU/Entity Classifi.    | CMN                   | https://sites.google.com/view/catslu/home                                                                    |              |
| chime4                  | The 4th CHiME Speech Separation and Recognition Challenge                                                                        | ASR/Multichannel ASR    | ENG                   | http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/                                                      |              |
| chime6                  | The 6th CHiME Speech Separation and Recognition Challenge                                                                        | ASR                     | ENG                   | https://chimechallenge.github.io/chime6/                                                                     |              |
| clarity21               | The First Clarity Enhancement Challenge CEC1                                                                                     | SE                      | ENG                   | https://claritychallenge.github.io/clarity_CEC1_doc/                                                         |              |
| cmu_arctic              | CMU ARCTIC                                                                                                                       | TTS                     | ENG                   | http://www.festvox.org/cmu_arctic/                                                                           |              |
| cmu_indic               | CMU INDIC                                                                                                                        | TTS                     | 7 languages           | http://festvox.org/cmu_indic/                                                                                |              |
| commonvoice             | The Mozilla Common Voice                                                                                                         | ASR                     | 13 languages          | https://voice.mozilla.org/datasets                                                                           |              |
| conferencingspeech21    | Far-field Multi-channel Speech Enhancement Challenge for Video Conferencing (ConferencingSpeech 2021)                            | SE                      | ENG, CMN              | https://tea-lab.qq.com/conferencingspeech-2021                                                               |              |
| covost2                 | Multilingual speech-to-text translation corpus from Common Voice                                                                 | ST                      | lang pairs from 22    | https://github.com/facebookresearch/covost                                                                   |              |
| csj                     | Corpus of Spontaneous Japanese                                                                                                   | ASR                     | JPN                   | https://pj.ninjal.ac.jp/corpus_center/csj/en/                                                                |              |
| csmsc                   | Chinese Standard Mandarin Speech Corpus                                                                                          | TTS                     | CMN                   | https://www.data-baker.com/open_source.html                                                                  |              |
| css10                   | CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages                                                           | TTS                     | 10 languages          | https://github.com/Kyubyong/css10                                                                            |              |
| dcase22_task1       | DCASE Task1 2022 Dataset                                                                                                             | SLU                     | ENG                   | https://dcase.community/challenge2022/task-low-complexity-acoustic-scene-classification                      |              |
| dirha_wsj               | Distant-speech Interaction for Robust Home Applications                                                                          | Multichannel ASR        | ENG                   | https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj                                        |              |
| dns_ins20               | Deep Noise Suppression Challenge – INTERSPEECH 2020                                                                              | SE                      | 7 languages + singing | https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/ |              |
| dns_icassp21            | Deep Noise Suppression Challenge – ICASSP 2021                                                                                   | SE                      | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2021/      |              |
| dns_icassp22            | Deep Noise Suppression Challenge – ICASSP 2022                                                                                   | SE                        | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2022/    |              |
| dns_ins21               | Deep Noise Suppression Challenge – INTERSPEECH 2021                                                                              | SE                      | 11 languages + singing| https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2021/ |              |
| dsing                   | Automatic Lyric Transcription from Karaoke Vocal Tracks (From DAMP Sing300x30x2)                                                 | ASR (ALT)               | ENG singing           | https://github.com/groadabike/Kaldi-Dsing-task                                                               |              |
| fisher_callhome_spanish | Fisher and CALLHOME Spanish--English Speech Translation                                                                          | ASR/ST                  | SPA->ENG              | https://catalog.ldc.upenn.edu/LDC2014T23                                                                     |              |
| fleurs                  | Few-shot Learning Evaluation of Universal Representations of Speech                                                              | ASR/Multilingual        | 102 languages         | https://huggingface.co/datasets/google/fleurs                                                                |              |
| fsc                     | Fluent Speech Commands Dataset                                                                                                   | SLU                     | ENG                   | https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/               |              |
| fsc_challenge           | Fluent Speech Commands Dataset MASE Eval Challenge splits                                                                        | SLU                     | ENG                   | https://github.com/maseEval/mase                                                                             |              |
| fsc_unseen              | Fluent Speech Commands Dataset MASE Eval Unseen splits                                                                           | SLU                     | ENG                   | https://github.com/maseEval/mase                                                                             |              |
| gigaspeech              | GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio                                          | ASR                     | ENG                   | https://github.com/SpeechColab/GigaSpeech                                                                    |              |
| googlei18n_lowresource  | Googlei18n crowdsource project                                                                                                   | TTS                     | ENG                   | https://github.com/mirumee/google-i18n-address (most in openslr as separate entries)                         |              |
| grabo                   | Grabo dataset                                                                                                                    | SLU                     | ENG + NLD             | https://www.esat.kuleuven.be/psi/spraak/downloads/                                                           |              |
| harpervalley             | HarperValleyBank: A Domain-Specific Spoken Dialog Corpus                                                                            | SLU                     | ENG                   | https://github.com/cricketclub/gridspace-stanford-harper-valley                                                       |              |
| hkust                   | HKUST/MTS: A very large scale Mandarin telephone speech corpus                                                                   | ASR                     | CMN                   | https://catalog.ldc.upenn.edu/LDC2005S15                                                                     |              |
| how2                    | How2: A Large-scale Dataset for Multimodal Language Understanding                                                                | ASR/MT/ST               | ENG->POR              | https://github.com/srvk/how2-dataset                                                                         |              |
| how2_2000h              | How2_2000h fbank features                                                                                                        | ASR/SUM                 | ENG->POR              | https://arxiv.org/pdf/2110.06263.pdf                                                                         |              |
| hub4_spanish            | 1997 Spanish Broadcast News Speech                                                                                               | ASR                     | SPA                   | https://catalog.ldc.upenn.edu/LDC98S74                                                                       |              |
| hui_acg                 | HUI-audio-corpus-german                                                                                                          | TTS                     | DEU                   | https://opendata.iisys.de/datasets.html#hui-audio-corpus-german                                              |              |
| iam                     | IAM Handwriting Database 3.0                                                                                                     | OCR                     | ENG                   | https://fki.tic.heia-fr.ch/databases/iam-handwriting-database                                                |              |
| iemocap                 | IEMOCAP database: The Interactive Emotional Dyadic Motion Capture database                                                       | SLU                     | ENG                   | https://sail.usc.edu/iemocap/                                                                                |              |
| indic_speech            | IndicSpeech: Text-to-Speech Corpus for Indian Languages                                                                          | TTS                     | 3 Indic languages     | http://cvit.iiit.ac.in/research/projects/cvit-projects/text-to-speech-dataset-for-indian-languages           |              |
| iwslt14                 | IWSLT14 MT shared task                                                                                                           | MT                      | DEU->ENG             | http://dl.fbaipublicfiles.com/fairseq/data/iwslt14/de-en.tgz                                                  |              |
| iwslt21_low_resource    | ALFFA, IARPA Babel, Gamayun, IWSLT 2021                                                                                          | ASR                     | SWA                   | http://www.openslr.org/25/ https://catalog.ldc.upenn.edu/LDC2017S05 https://gamayun.translatorswb.org/data/ https://iwslt.org/2021/low-resource |              |
| iwslt22_dialect         | IWSLT2022 dialectal speech translation shared task                                                                               | ASR/ST                  | ARA->Tunisian ARA     | https://github.com/kevinduh/iwslt22-dialect.git                                                              |              |
| iwslt22_low_resource    | IWSLT2022 Low-resource speech translation track task                                                                             | ST                      | Tamasheq->French      | https://github.com/mzboito/IWSLT2022_Tamasheq_data.git                                                       |              |
| jdcinal                 | Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags | SLU                     | JPN                   | http://www.lrec-conf.org/proceedings/lrec2018/pdf/464.pdf http://tts.speech.cs.cmu.edu/awb/infomation_navigation_and_attentive_listening_0.2.zip |              |
| jkac                    | J-KAC: Japanese Kamishibai and audiobook corpus                                                                                  | TTS                     | JPN                  | https://sites.google.com/site/shinnosuketakamichi/research-topics/j-kac_corpus                               |              |
| jmd                     | JMD: Japanese multi-dialect corpus for speech synthesis                                                                          | TTS                     | JPN                  | https://sites.google.com/site/shinnosuketakamichi/research-topics/jmd_corpus                                 |              |
| jsss                    | JSSS: Japanese speech corpus for summarization and simplification                                                                | TTS                     | JPN                  | https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus                                |              |
| jsut                    | Japanese speech corpus of Saruwatari-lab., University of Tokyo                                                                   | ASR/TTS                 | JPN                  | https://sites.google.com/site/shinnosuketakamichi/publication/jsut                                           |              |
| jtubespeech             | Japanese YouTube Speech corpus                                                                                                   | ASR/TTS                 | JPN                  |                                                                                                              |              |
| jv_openslr35            | Javanese                                                                                                                         | ASR                     | JAV                  | http://www.openslr.org/35                                                                                    |              |
| jvs                     | JVS (Japanese versatile speech) corpus                                                                                           | TTS                     | JPN                  | https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus                                 |              |
| ksponspeech             | KsponSpeech (Korean spontaneous speech) corpus                                                                                   | ASR                     | KOR                  | https://aihub.or.kr/aidata/105                                                                               |              |
| kss                     | Korean single speaker corpus                                                                                                     | TTS                     | KOR                  | https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset                                        |              |
| l3das22                | L3DAS22: Machine Learning for 3D Audio Signal Processing - ICASSP 2022                                                            | SE                     | ENG                  | https://www.l3das.com/icassp2022/                                                                            |              |
| laborotv                | LaboroTVSpeech (A large-scale Japanese speech corpus on TV recordings)                                                           | ASR                     | JPN                  | https://laboro.ai/column/eg-laboro-tv-corpus-jp                                                              |              |
| librilight_limited      | Librilight-limited subset                                                                                                        | ASR                     | ENG                  | https://dl.fbaipublicfiles.com/librilight/data/librispeech_finetuning.tgz                                    |              |
| librimix                | LibriMix: An Open-Source Dataset for Generalizable Speech Separation                                                             | SE/DIAR                 | ENG                  | https://github.com/JorisCos/LibriMix                                                                         |              |
| librispeech             | LibriSpeech ASR corpus                                                                                                           | ASR                     | ENG                  | http://www.openslr.org/12                                                                                    |              |
| librispeech_100         | LibriSpeech ASR corpus 100h subset                                                                                               | ASR                     | ENG                  | http://www.openslr.org/12                                                                                    |              |
| libritts                | LibriTTS corpus                                                                                                                  | TTS                     | ENG                  | http://www.openslr.org/60                                                                                    |              |
| ljspeech                | The LJ Speech Dataset                                                                                                            | TTS                     | ENG                  | https://keithito.com/LJ-Speech-Dataset/                                                                      |              |
| lrs2                    | The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset                                                                            | Lipreading/ASR          | ENG                  | https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html                                                  |              |
| lrs3                    | The Oxford-BBC Lip Reading Sentences 3 (LRS3) Dataset                                                                            | ASR                     | ENG                  | https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html                                                  |              |
| lt_slurp_spatialized    | Spatialized Libri-Trans and Spatialized SLURP (LT-S and SLURP-S), Enhancement for Translation and Understanding Dataset          | SE/ST/SLU               | ENG                  |                                                                                                              |              |
| magicdata               | MAGICDATA Mandarin Chinese Read Speech Corpus                                                                                    | ASR                     | CMN                  | https://www.openslr.org/68/                                                                                  |              |
| media                   | MEDIA speech database for French                                                                                                 | SLU/Entity Classifi.    | FRA                  | https://catalogue.elra.info/en-us/repository/browse/ELRA-S0272/                                              |              |
| mediaspeech             | MediaSpeech: Multilanguage ASR Benchmark and Dataset                                                                             | ASR                     | FRA                  | https://www.openslr.org/108/                                                                                 |              |
| meld                    | MELD: Multimodal EmotionLines Dataset                                                                                            | SLU                     | ENG                  | https://affective-meld.github.io/                                                                            |              |
| microsoft_speech        | Microsoft Speech Corpus (Indian languages)                                                                                       | ASR                     | 3 languages          | https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e                                        |              |
| mini_an4                | Mini version of CMU AN4 database for the integration test                                                                        | ASR/TTS/SE              | ENG                  | http://www.speech.cs.cmu.edu/databases/an4/                                                                  |              |
| mini_librispeech        | Mini version of Librispeech corpus                                                                                               | DIAR                    | ENG                  | https://openslr.org/31/                                                                                      |              |
| misp2021                | Multimodal Information Based Speech Processing (MISP) Challenge 2021                                                             | ASR/AVSR                | CMN                  | https://mispchallenge.github.io/                                                                             |              |
| ml_openslr63            | Crowdsourced high-quality Malayalam multi-speaker speech data                                                                    | ASR                     | MAL                  | https://openslr.org/63/                                                                                      |              |
| mls                     | MLS (A large multilingual corpus derived from LibriVox audiobooks)                                                               | ASR                     | 8 languages          | http://www.openslr.org/94/                                                                                   |              |
| mr_openslr64            | OpenSLR Marathi Corpus                                                                                                           | ASR                     | MAR                  | http://www.openslr.org/64/                                                                                   |              |
| ms_indic_is18           | Microsoft Speech Corpus (Indian languages)                                                                                       | ASR                     | 3 langs: TEL TAM GUJ | https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e                                        |              |
| ml_superb               | Multilingual SUPERB benchmark                                                                                                    | ASR                     | 145 languages        | Not Released                                                                                                 |              |
| mucs21_subtask1         | MUltilingual and Code-Switching ASR Challenges for Low Resource Indian Languages                                                 | ASR                     | 6 Indian languages   | https://navana-tech.github.io/MUCS2021/challenge_details.html                                                |              |
| mucs21_subtask2         | MUltilingual and Code-Switching ASR Challenges for Low Resource Indian Languages                                                 | ASR                     | 2 code-switching datasets | https://navana-tech.github.io/MUCS2021/challenge_details.html                                           |              |
| must_c                  | MuST-C: a Multilingual Speech Translation Corpus                                                                                 | ASR/MT/ST               | ENG->14langs         | https://ict.fbk.eu/must-c/                                                                                   |              |
| must_c_v2               | MuST-C version 2                                                                                                                 | ASR/MT/ST               | ENG->DEU             | https://ict.fbk.eu/must-c/                                                                                   |              |
| nsc                     | National Speech Corpus                                                                                                           | ASR                     | ENG-SG               | https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus                        |              |
| ofuton_p_utagoe_db      | Ofuton_p_utagoe Singing voice synthesis corpus                                                                                   | SVS                     | JPN                  | https://sites.google.com/view/oftn-utagoedb/%E3%83%9B%E3%83%BC%E3%83%A0                                      |              |
| open_li110              | Corpus combination with 110 languages                                                                                            | Multilingual ASR        | 100+ languages       |                                                                                                              |              |
| open_li52               | Corpus combination with 52 languages (Common Voice + VoxForge)                                                                   | Multilingual ASR        | 52 languages         |                                                                                                              |              |
| opencpop                | Opencpop: Mandarin singing voice synthesis corpus                                                                                | SVS                     | CMN                  | https://wenet.org.cn/opencpop/                                                                               |              |
| polyphone_swiss_french  | Swiss French Polyphone corpus                                                                                                    | ASR                     | FRA                  | http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02                                               |              |
| portmedia_dom           | PortMedia French corpus                                                                                                          | SLU/Entity Classifi.    | FRA                  | https://catalogue.elra.info/en-us/repository/browse/ELRA-S0371/                                              |              |
| portmedia_lang          | PortMedia Italian corpus                                                                                                         | SLU/Entity Classifi.    | ITA                  | https://catalogue.elra.info/en-us/repository/browse/ELRA-S0371/                                              |              |
| primewords_chinese      | Primewords Chinese Corpus Set 1                                                                                                  | ASR                     | CMN                  | https://www.openslr.org/47/                                                                                  |              |
| puebla_nahuatl          | Highland Puebla Nahuatl corpus (endangered language in central Mexico)                                                           | ASR/ST                  | HPN                  | https://www.openslr.org/92/                                                                                  |              |
| qasr_tts                | TTS character based system using semi-supervised data selection                                                                  | TTS                     | ARA                  | https://arabicspeech.org/qasr_tts                                                                                  |              |
| reazonspeech            | ReazonSpeech: Japanese Corpus collected from TV Programs                                                                         | ASR                     | JPN                  | https://research.reazon.jp/projects/ReazonSpeech/                                                            |              |
| reverb                  | REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge                                                       | ASR                     | ENG                  | https://reverb2014.dereverberation.com/                                                                      |              |
| ru_open_stt             | Russian Open Speech To Text (STT/ASR) Dataset                                                                                    | ASR                     | RUS                  | https://github.com/snakers4/open_stt                                                                         |              |
| ruslan                  | RUSLAN: Russian Spoken Language Corpus For Speech Synthesis                                                                      | TTS                     | RUS                  | https://ruslan-corpus.github.io/                                                                             |              |
| seame                   | SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia                                                        | ASR                     | ENG + CMN            | https://catalog.ldc.upenn.edu/LDC2015S04                                                                     |              |
| sinhala                 | Sinhala speech recognition corpus                                                                                                | ASR                     | SIN                  | https://drive.google.com/file/d/17_e0JhMW4_FPxfh93foplnxb4OQp8zh3/view?usp=sharing                           |              |
| siwis                   | SIWIS: Spoken Interaction with Interpretation in Switzerland                                                                     | TTS                     | FRA                  | https://datashare.ed.ac.uk/handle/10283/2353                                                                 |              |
| slue-voxceleb           | SLUE: Spoken Language Understanding Evaluation                                                                                   | SLU                     | ENG                  | https://github.com/asappresearch/slue-toolkit                                                                |              |
| slue-voxpopuli          | SLUE: Spoken Language Understanding Evaluation                                                                                   | SLU                     | ENG                  | https://github.com/asappresearch/slue-toolkit                                                                |              |
| slurp                   | SLURP: A Spoken Language Understanding Resource Package                                                                          | SLU                     | ENG                  | https://github.com/pswietojanski/slurp                                                                       |              |
| slurp_entity            | SLURP: A Spoken Language Understanding Resource Package                                                                          | SLU/Entity Classifi.    | ENG                  | https://github.com/pswietojanski/slurp                                                                       |              |
| slurp_spatialized       | Spatialized SLURP (SLURP-S), Noisy Reverberant Spoken Language Understanding Dataset                                             | SLU                     | ENG                  |                                                                                                              |              |
| sms_wsj                 | SMS-WSJ: A database for in-depth analysis of multi-channel source separation algorithms                                          | SE                      | ENG                  | https://github.com/fgnt/sms_wsj                                                                              |              |
| snips                   | SNIPS: A dataset for spoken language understanding                                                                               | SLU                     | ENG                  | https://github.com/sonos/spoken-language-understanding-research-datasets                                     |              |
| speechcommands          | Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition                                                             | SLU                     | ENG                  | https://www.tensorflow.org/datasets/catalog/speech_commands                                                  |              |
| spgispeech              | SPGISpeech 5k corpus                                                                                                             | ASR                     | ENG                  | https://datasets.kensho.com/datasets/scribe                                                                  |              |
| su_openslr36            | Sundanese                                                                                                                        | ASR                     | SUN                  | http://www.openslr.org/36                                                                                    |              |
| swbd                    | Switchboard Corpus for 2-channel Conversational Telephone Speech (300h)                                                          | ASR                     | ENG                  | https://catalog.ldc.upenn.edu/LDC97S62                                                                       |              |
| swbd_da                 | NXT Switchboard Annotations                                                                                                      | SLU                     | ENG                  | https://catalog.ldc.upenn.edu/LDC2009T26                                                                     |              |
| swbd_sentiment          | Speech Sentiment Annotations                                                                                                     | SLU                     | ENG                  | https://catalog.ldc.upenn.edu/LDC2020T14                                                                     |              |
| talromur                | Talromur: A large Icelandic TTS corpus                                                                                           | TTS                     | ISL                  | https://repository.clarin.is/repository/xmlui/handle/20.500.12537/104, https://aclanthology.org/2021.nodalida-main.50.pdf                |              |
| talromur2               | Talromur 2: Icelandic multi-speaker TTS corpus                                                                                   | TTS                     | ISL                  | https://repository.clarin.is/repository/xmlui/handle/20.500.12537/167                                        |              |
| tedlium2                | TED-LIUM corpus release 2                                                                                                        | ASR                     | ENG                  | https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf                |              |
| tedlium3                | TED-LIUM corpus release 3                                                                                                        | ASR                     | ENG                  | https://www.openslr.org/51/                |              |
| tedx_spanish_openslr67  | TEDx Spanish Corpus                                                                                                              | ASR                     | SPA                 | https://www.openslr.org/67/                                                                                   |              |
| thchs30                 | A Free Chinese Speech Corpus Released by CSLT@Tsinghua University                                                                | ASR/TTS                 | CMN                  | https://www.openslr.org/18/                                                                                  |              |
| timit                   | TIMIT Acoustic-Phonetic Continuous Speech Corpus                                                                                 | ASR/UASR                | ENG                  | https://catalog.ldc.upenn.edu/LDC93S1                                                                        |              |
| totonac                 | Highland Totonac corpus (endangered language in central Mexico)                                                                  | ASR                     | TOS                  | http://www.openslr.org/107/                                                                                  |              |
| tsukuyomi               | つくよみちゃんコーパス (Tsukuyomi-chan Corpus)                                                                                        | TTS                     | JPN                  | https://tyc.rei-yumesaki.net/material/corpus                                                                 |              |
| vctk                    | English Multi-speaker Corpus for CSTR Voice Cloning Toolkit                                                                      | ASR/TTS                 | ENG                  | http://www.udialogue.org/download/cstr-vctk-corpus.html                                                      |              |
| vctk_reverb             | Reverberant speech database (48kHz)                                                                                              | SE                      | ENG                  | https://datashare.ed.ac.uk/handle/10283/2826                                                                 |              |
| vctk_noisyreverb        | Noisy reverberant speech database (48kHz)                                                                                        | SE                      | ENG                  | https://datashare.ed.ac.uk/handle/10283/2826                                                                 |              |
| vivos                   | VIVOS (Vietnamese corpus for ASR)                                                                                                | ASR                     | VIE                  | https://doi.org/10.5281/zenodo.7068130                                                                       |              |
| voxforge                | VoxForge                                                                                                                         | ASR                     | 7 languages          | http://www.voxforge.org/                                                                                     |              |
| wenetspeech             | WenetSpeech: A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition                                                   | ASR                     | CMN                  | https://wenet-e2e.github.io/WenetSpeech/                                                                     |              |
| wham                    | The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset                                                                                | SE                      | ENG                  | https://wham.whisper.ai/                                                                                     |              |
| whamr                   | WHAMR!: Noisy and Reverberant Single-Channel Speech Separation                                                                   | SE                      | ENG                  | https://wham.whisper.ai/                                                                                     |              |
| wsj                     | CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete                                                                                    | ASR                     | ENG                  | https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A                              |              |
| wsj0_2mix               | MERL WSJ0-mix multi-speaker dataset                                                                                              | ASR/SE                  | ENG                  | http://www.merl.com/demos/deep-clustering                                                                    |              |
| wsj0_2mix_spatialized   | MERL WSJ0-mix multi-speaker dataset (Spatialized version)                                                                        | ASR/Multichannel ASR/SE | ENG                  | http://www.merl.com/demos/deep-clustering                                                                    |              |
| yesno                   | The "yesno" corpus                                                                                                               | ASR                     | HEB                  | http://www.openslr.org/1                                                                                     |              |
| yoloxochitl_mixtec      | Yoloxochitl-Mixtec corpus (endangered language in central Mexico)                                                                | ASR                     | XTY                  | http://www.openslr.org/89                                                                                    |              |
| zeroth_korean           | Zeroth-Korean                                                                                                                    | ASR                     | KOR                  | http://www.openslr.org/40                                                                                    |              |
| zh_openslr38            | ST-CMDS-20170001_1, Free ST Chinese Mandarin Corpus                                                                              | ASR                     | CMN                  | http://www.openslr.org/38                                                                                    |              |
