Datasets
Institute of African Digital Humanities
Ewondo-French Parallel Corpus
License: NOODL-1.0
Locale: ewo, fr
Task: MT
Format: TSV
Size: 137.84 KB
Open Home Foundation
Dimitar 1.0
License: CC0-1.0
Locale: bg-BG
Task: TTS
Format: WEBM
Size: 109.58 MB
Tamahi Suneha Magazine
Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pa-PK
Task: OTH
Format: TXT
Size: 1.83 MB
Sujaak Adbi Sangat
Saraiki Quarterly Magazine Wasson Wehray Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: OTH
Format: TXT
Size: 2.09 MB
Bismillah Graphics Publishers
Urdu Literature Corpus
License: CC-BY-NC-4.0
Locale: ur
Task: OTH
Format: TXT
Size: 2.86 MB
Kaleem Art Press
Saraiki Literature Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: OTH
Format: TXT
Size: 1.84 MB
Community
Podcast Hari Minggoean (Indonesia)
License: CC-BY-SA-4.0
Locale: id-ID
Task: ASR
Format: MP3
Size: 338.92 MB
Kaltepetlahtol
Tetelancingo Nahuatl
License: CC-BY-NC-4.0
Locale: nhi
Task: ASR
Format: TSV, WAV
Size: 952.98 MB
Common Voice
Mozilla Common Voice Spontaneous Speech ASR Shared Task Train/Dev Data
License: CC0-1.0
Locale: mul
Task: ASR
Format: MP3
Size: 4.30 GB