DataTrust Africa: Speech Corpus of Public Radio Recordings from Northern Uganda
License:
NOODL-1.0
Steward:
Amara Hub
Task: NLP
Release Date: 1/15/2026
Format: MP3
Size: 179.82 MB
Description
This is an open-access corpus of short clips of public radio content from Mega 100 FM, Q FM, Radio Pacis and Radio Rupiny in Northern Uganda. The online corpus currently contains over 350 clips of English-language recordings, and we hope to add finely annotated transcripts for them. The dataset is intended for NLP research and other non-commercial use. Upcoming datasets to look out for from Amara Hub include public radio recordings in other languages spoken in the region, such as Acholi, Lango, Lugbara and Akaramajong.
Specifics
Licensing
Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)
https://licensingafricandatasets.com/nwulite-obodo-license
Considerations
Restrictions/Special Constraints
Please contact us if you are interested in collaborating with DataTrust Africa: https://datatrust.africa/
Processes
Ethical Review
Ethical Data Advisory: https://drive.google.com/file/d/1NR9sF-1iplVCAV5ziygz45OdsHktV5a8/view
Intended Use
DataTrust Africa operates as a non-commercial, public-interest entity. We transform short, publicly available content into structured insights that support transparency and social good. Any access fees reflect cost recovery for processing and infrastructure, not the sale of broadcast content. Original audio remains the property of its respective broadcasters and is not redistributed.