Datasets
Bojonegoro Javanese TTS
License: CC-BY-SA-4.0
Locale: jav
Task: TTS
Format: .tar.gz, WEBM
Size: 469.50 MB
Common Voice Spontaneous Speech 2.0 - Thur
License: CC0-1.0
Locale: lth
Task: ASR
Format: MP3
Size: 292.98 MB
Common Voice Scripted Speech 24.0 - Greek
License: CC0-1.0
Locale: el
Task: ASR
Format: MP3
Size: 741.82 MB
Tugão 1.0
License: CC0-1.0
Locale: pt-PT
Task: TTS
Format: WEBM
Size: 61.84 MB
DataTrust Africa: Speech Corpus of Public Radio Recordings from Northern Uganda
License: NOODL-1.0
Locale: en-US
Task: NLP
Format: MP3
Size: 179.82 MB
Documenting Ekpeye Folktales and Preserving Cultural Heritage
License: CC-BY-NC-SA-4.0
Locale: ekp
Task: OTH
Format: MP4, TXT, DOCX
Size: 5.97 GB
Podcast Hari Minggoean (Indonesia)
License: CC-BY-SA-4.0
Locale: id-ID
Task: ASR
Format: MP3
Size: 338.92 MB
Multilingual Humanitarian Response Eval (MHRE)
License: CC-BY-NC-SA-4.0
Locale: mul
Task: LLM
Format: CSV
Size: 2.15 MB
Anna 1.0
License: CC0-1.0
Locale: hu-HU
Task: TTS
Format: WEBM
Size: 95.27 MB
Akoose-ALCAM-MultimodalDataset
License: NOODL-1.0
Locale: bss
Task: NLP
Format: MP3, TSV
Size: 16.05 MB
smoltalk-chinese
License: Apache-2.0
Locale: zh
Task: LLM
Format: PARQUET
Size: 879.81 MB
Lili 1.0
License: CC0-1.0
Locale: sk-SK
Task: TTS
Format: WEBM
Size: 72.38 MB
Chishti Sons Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pnb
Task: NLP
Format: TXT
Size: 1.65 MB
Common Voice Spontaneous Speech 2.0 - Eastern Min
License: CC0-1.0
Locale: cdo
Task: ASR
Format: MP3
Size: 190.61 MB
Common Voice Scripted Speech 24.0 - Mokpwe
License: CC0-1.0
Locale: bri
Task: ASR
Format: MP3
Size: 188.52 MB
Common Voice Scripted Speech 24.0 - Ngomba
License: CC0-1.0
Locale: jgo
Task: ASR
Format: MP3
Size: 216.91 MB
Common Voice Scripted Speech 24.0 - Fang
License: CC0-1.0
Locale: fan
Task: ASR
Format: MP3
Size: 235.94 MB
Common Voice Scripted Speech 24.0 - Ebrie
License: CC0-1.0
Locale: ebr
Task: ASR
Format: MP3
Size: 61.92 MB
Darkman 1.0
License: CC0-1.0
Locale: pl-PL
Task: TTS
Format: WEBM
Size: 40.42 MB
Chitwan 1.0
License: CC0-1.0
Locale: ne-NP
Task: TTS
Format: WEBM
Size: 61.68 MB
Common Voice Scripted Speech 24.0 - Slovak
License: CC0-1.0
Locale: sk
Task: ASR
Format: MP3
Size: 1.08 GB
Common Voice Scripted Speech 24.0 - Czech
License: CC0-1.0
Locale: cs
Task: ASR
Format: MP3
Size: 5.54 GB
Talar (تلار) Barahui Magazine Corpus
License: CC-BY-NC-SA-4.0
Locale: brh
Task: NLP
Format: TXT
Size: 317.22 KB
Common Voice Scripted Speech 24.0 - Basaa
License: CC0-1.0
Locale: bas
Task: ASR
Format: MP3
Size: 242.77 MB