Datasets
INEL Nganasan Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: nio
Task: ASR
Format: TSV, MP3
Size: 1.29 GB
INEL Evenki Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: evn
Task: ASR
Format: TSV, MP3
Size: 103.03 MB
INEL Dolgan Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: dlg
Task: ASR
Format: TSV, MP3
Size: 583.34 MB
INEL Kamas Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: xas
Task: ASR
Format: TSV, MP3
Size: 376.64 MB
INEL Selkup Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: sel
Task: ASR
Format: TSV, MP3
Size: 45.46 MB
INEL Enets Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: enf, enh
Task: ASR
Format: TSV, MP3
Size: 140.56 MB
INEL Nenets Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: yrk
Task: ASR
Format: TSV, MP3
Size: 8.35 MB
Common Voice Scripted Speech 25.0 - Bengali
License: CC0-1.0
Locale: bn
Task: ASR
Format: MP3
Size: 24.84 GB
Common Voice Scripted Speech 25.0 - Chinese (China)
License: CC0-1.0
Locale: zh-CN
Task: ASR
Format: MP3
Size: 21.38 GB
English Hausa Parallel Corpus
License: CC-BY-NC-4.0
Locale: eng, hau
Task: MT
Format: CSV
Size: 164.32 KB
Persian Literature Corpus by Najwai Sukhan
License: CC-BY-NC-4.0
Locale: fas
Task: NLP
Format: TXT
Size: 38.62 MB
Heroes English-Spanish Dubbed Movie Speech Corpus
License: CC-BY-SA-4.0
Locale: eng, spa
Task: NLP
Format: WAV, CSV, TXT
Size: 1.68 GB
Common Voice Scripted Speech 25.0 - Swahili
License: CC0-1.0
Locale: sw
Task: ASR
Format: MP3
Size: 20.87 GB
Common Voice Scripted Speech 25.0 - Kabyle
License: CC0-1.0
Locale: kab
Task: ASR
Format: MP3
Size: 17.43 GB
Common Voice Scripted Speech 25.0 - Basque
License: CC0-1.0
Locale: eu
Task: ASR
Format: MP3
Size: 14.48 GB
Common Voice Scripted Speech 25.0 - Japanese
License: CC0-1.0
Locale: ja
Task: ASR
Format: MP3
Size: 14.34 GB
Common Voice Scripted Speech 25.0 - Luganda
License: CC0-1.0
Locale: lg
Task: ASR
Format: MP3
Size: 11.06 GB
Common Voice Scripted Speech 25.0 - Czech
License: CC0-1.0
Locale: cs
Task: ASR
Format: MP3
Size: 5.56 GB
Common Voice Scripted Speech 25.0 - Urdu
License: CC0-1.0
Locale: ur
Task: ASR
Format: MP3
Size: 5.78 GB
Common Voice Scripted Speech 25.0 - Georgian
License: CC0-1.0
Locale: ka
Task: ASR
Format: MP3
Size: 6.37 GB
Common Voice Scripted Speech 25.0 - Thai
License: CC0-1.0
Locale: th
Task: ASR
Format: MP3
Size: 8.38 GB
Common Voice Scripted Speech 25.0 - Russian
License: CC0-1.0
Locale: ru
Task: ASR
Format: MP3
Size: 6.55 GB
Common Voice Scripted Speech 25.0 - Italian
License: CC0-1.0
Locale: it
Task: ASR
Format: MP3
Size: 9.71 GB
Common Voice Scripted Speech 25.0 - Galician
License: CC0-1.0
Locale: gl
Task: ASR
Format: MP3
Size: 7.81 GB