Datasets
Effect AI Scripted Speech 1.0 - English
License: CC0-1.0
Locale: en
Task: TTS
Format: CSV, MP3
Size: 663.45 MB
DataTrust Africa: Speech Corpus of Public Radio Recordings from Northern Uganda
License: NOODL-1.0
Locale: en-US
Task: NLP
Format: MP3
Size: 179.82 MB
Khmer ASR Cultural Dataset
License: CC-BY-SA-4.0
Locale: khm
Task: ASR
Format: WAV
Size: 12.59 GB
Corpus of Panjebar Semangat Javanese-Language Magazine
License: CC-BY-SA-4.0
Locale: jav
Task: OTH
Format: TXT
Size: 4.31 MB
SI-NLI
License: CC-BY-NC-SA-4.0
Locale: sl
Task: NLU
Format: TSV
Size: 392.44 KB
Vallader Newspaper Corpus
License: CC0-1.0
Locale: rm-vallader
Task: OTH
Format: TSV
Size: 18.71 MB
Multilingual Religious Parallel Corpus (Kaleem Art Press)
License: CC-BY-SA-4.0
Locale: mul
Task: MT
Format: CSV
Size: 2.27 MB
Sindh Line Publishers
License: CC-BY-SA-4.0
Locale: snd
Task: NLP
Format: TXT
Size: 2.22 MB
Spoken-Congolese-French-Dataset
License: NOODL-1.0
Locale: fr-CG
Task: NLP
Format: MP3, WAV, TSV
Size: 3.44 GB
Ewondo_Mbida-Mbani_ALCAM-MultimodalDataset
License: NOODL-1.0
Locale: ewo
Task: NLP
Format: MP3, TSV
Size: 19.25 MB
Balochi Academy Text Corpus
License: CC-BY-NC-SA-4.0
Locale: bgn
Task: NLP
Format: TXT
Size: 1.88 MB
Mada Narratives
License: NOODL-1.0
Locale: mxu
Task: NLP
Format: TXT
Size: 65.04 KB
Surmiran Newspaper Corpus
License: CC0-1.0
Locale: rm-surmiran
Task: OTH
Format: TSV
Size: 11.89 MB
DhoNam: Dholuo Speech dataset
License: NOODL-1.0
Locale: luo
Task: ASR
Format: WEBM
Size: 2.49 GB
Archivo de la Comisionada María de los Ángeles Guzmán García (COTAI Nuevo León / InfoNL)
License: CC-BY-4.0
Locale: es-MX
Task: NLP
Format: ZIP, PDF, CSV, XLSX
Size: 866.15 MB
Improving AI Conflict Resolution Capacities: A Prompts-Based Evaluation
License: CC-BY-4.0
Locale: mul
Task: NLP
Format: CSV, PDF
Size: 1.46 MB
AI on the Frontline: Evaluating Large Language Models in Real-World Conflict Resolution
License: CC-BY-4.0
Locale: en-US
Task: NLP
Format: CSV, PDF
Size: 2.36 MB
Teke-Laali-TTS-Dataset
License: NOODL-1.0
Locale: lli
Task: TTS
Format: WAV, TSV
Size: 635.61 MB
Bati-MultiDialectalASR-Dataset
License: NOODL-1.0
Locale: btc
Task: ASR
Format: WAV, TSV
Size: 3.27 GB
Mozilla Common Voice Text Language Identification dataset
License: CC0-1.0
Locale: mul
Task: NLP
Format: TSV
Size: 950.41 MB
Central Kurdish TTS dataset 1.0
License: CC-BY-4.0
Locale: ckb
Task: TTS
Format: WAV
Size: 293.45 MB
Laari-TTS-Dataset
License: NOODL-1.0
Locale: ldi
Task: ASR
Format: WAV, TRJS, TSV
Size: 568.26 MB
Bomitaba-TTS-Dataset
License: NOODL-1.0
Locale: zmx
Task: TTS
Format: WAV, TSV
Size: 1.00 GB
Suundi-TTS-Dataset
License: NOODL-1.0
Locale: sdj
Task: TTS
Format: WAV, TSV
Size: 240.50 MB
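
Each entry above follows the same pattern: a title line followed by License, Locale, Task, Format, and Size fields. The following is a minimal sketch of how such a listing could be parsed into records for filtering or sorting. The function names `parse_listing` and `parse_size` are hypothetical, and the conversion assumes decimal units (1 KB = 1000 bytes), which the listing itself does not state.

```python
import re

# The five labelled fields that follow every dataset title in the listing.
FIELDS = ("License", "Locale", "Task", "Format", "Size")

# Assumption: sizes use decimal units; the listing does not say which.
UNIT_BYTES = {"KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size string such as '392.44 KB' to a byte count (float)."""
    match = re.fullmatch(r"([\d.]+)\s*(KB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNIT_BYTES[match.group(2)]


def parse_listing(lines):
    """Group listing lines into one dict per dataset.

    A line of the form 'Field: value' with a known field name is attached
    to the most recent title; any other non-empty line starts a new record.
    Titles may themselves contain ': ', so only known field names count.
    """
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(": ")
        if sep and key in FIELDS and current is not None:
            if key == "Format":
                # Split 'MP3, WAV, TSV' into a list and normalize the case.
                current[key] = [f.strip().upper() for f in value.split(",")]
            else:
                current[key] = value.strip()
        else:
            current = {"Title": line}
            records.append(current)
    # Drop lines that never received all five fields, e.g. a page heading.
    return [r for r in records if all(f in r for f in FIELDS)]
```

With the records in this form, selecting for example all entries whose Task is TTS, or sorting by `parse_size(record["Size"])`, becomes a one-line comprehension.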