DataTrust Africa: Speech Corpus of Public Radio Recordings from Northern Uganda

License:

NOODL-1.0

Steward:

Amara Hub

Task: NLP

Release Date: 1/15/2026

Format: MP3

Size: 179.82 MB


Description

This open-access corpus contains short clips of public radio content from Mega 100 FM, Q FM, Radio Pacis and Radio Rupiny in Northern Uganda. The online corpus currently holds over 350 clips of English-language recordings, and we hope to add finely annotated transcripts to them. The dataset is for use in NLP research and non-commercial use. Amara Hub also plans to release public radio recordings in other languages spoken in the region, such as Acholi, Lango, Lugbara and Akaramajong.

Specifics

Licensing

Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)

https://licensingafricandatasets.com/nwulite-obodo-license

Considerations

Restrictions/Special Constraints

Please contact us if you are interested in collaborating with DataTrust Africa: https://datatrust.africa/

Processes

Ethical Review

Ethical Data Advisory: https://drive.google.com/file/d/1NR9sF-1iplVCAV5ziygz45OdsHktV5a8/view

Intended Use

DataTrust Africa operates as a non-commercial, public-interest entity. We transform short, publicly available content into structured insights that support transparency and social good. Any access fees reflect cost recovery for processing and infrastructure, not the sale of broadcast content. Original audio remains the property of its respective broadcasters and is not redistributed.

Metadata