Datasets
Anna 1.0
License: CC0-1.0
Locale: hu-HU
Task: TTS
Format: WEBM
Size: 95.27 MB
Dave 1.0
License: CC0-1.0
Locale: es-ES
Task: TTS
Format: WEBM
Size: 85.24 MB
Kathleen 1.0
License: CC0-1.0
Locale: en-US
Task: TTS
Format: FLAC
Size: 211.96 MB
Joe 1.0
License: CC0-1.0
Locale: en-US
Task: TTS
Format: WEBM
Size: 75.78 MB
Kerstin 1.0
License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WEBM
Size: 132.05 MB
ReRooted: Speech Corpus of Testimonials from Armenian Refugees and Immigrants
License: GPL-3.0
Locale: hy
Task: ASR
Format: WAV, TEXTGRID
Size: 3.25 GB
Kaleem Magazine Urdu Corpus
License: CC-BY-NC-4.0
Locale: urd
Task: NLP
Format: TXT
Size: 2.74 MB
Baloch Publishers Saraiki Literature Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: NLP
Format: TXT
Size: 2.04 MB
Chishti Sons Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pnb
Task: NLP
Format: TXT
Size: 1.65 MB
FUB-Narratives
License: NOODL-1.0
Locale: fub
Task: NLP
Format: TXT
Size: 168.34 KB
Jazab Sindhi Newspaper Corpus
License: CC-BY-NC-SA-4.0
Locale: snd
Task: NLP
Format: TXT
Size: 2.33 MB
Tamir Sindhi News Corpus
License: CC-BY-NC-SA-4.0
Locale: snd
Task: NLP
Format: TXT
Size: 2.56 MB
Mediamen Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pnb
Task: NLP
Format: TXT
Size: 1.82 MB
Speech Corpus of Armenian Question-Answer Dialogues
License: GPL-3.0
Locale: hy
Task: ASR
Format: WAV, TEXTGRID, TXT
Size: 2.10 GB
Ewondo-French Parallel Corpus
License: NOODL-1.0
Locale: ewo, fr
Task: MT
Format: TSV
Size: 137.84 KB
Dimitar 1.0
License: CC0-1.0
Locale: bg-BG
Task: TTS
Format: WEBM
Size: 109.58 MB
Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pa-PK
Task: OTH
Format: TXT
Size: 1.83 MB
Saraiki Quarterly Magazine Wasson Wehray Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: OTH
Format: TXT
Size: 2.09 MB
Urdu Literature Corpus
License: CC-BY-NC-4.0
Locale: ur
Task: OTH
Format: TXT
Size: 2.86 MB
Saraiki Literature Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: OTH
Format: TXT
Size: 1.84 MB
Podcast Hari Minggoean (Indonesia)
License: CC-BY-SA-4.0
Locale: id-ID
Task: ASR
Format: MP3
Size: 338.92 MB
Tetelancingo Nahuatl
License: CC-BY-NC-4.0
Locale: nhi
Task: ASR
Format: TSV, WAV
Size: 952.98 MB
Mozilla Common Voice Spontaneous Speech ASR Shared Task Train/Dev Data
License: CC0-1.0
Locale: mul
Task: ASR
Format: MP3
Size: 4.30 GB