Yezoum_ALCAM-MultimodalDataset
License:
NOODL-1.0
Steward:
Institute of African Digital Humanities
Task: NLP
Release Date: 3/30/2026
Format: MP3, TSV
Size: 12.81 MB
Description
Yezoum_ALCAM-MultimodalDataset is a richly curated, multimodal linguistic dataset dedicated to the documentation and technological enhancement of the Yezoum variety of what is widely designated the 'Ewondo language', or sometimes of the macro-linguistic group known as Beti or Beti-Fang. Yezoum is a localised and socially embedded speech form that is rarely represented in standard grammatical descriptions or lexicographical resources. The dataset comprises three closely aligned components: (i) a structured datasheet containing carefully selected example sentences reflecting casual, albeit non-authentic, usage in the Yezoum variety; (ii) high-quality audio recordings of these sentences, produced by a native speaker; and (iii) an explicit audio–sentence mapping file enabling precise alignment between the textual and acoustic data.
The dataset's primary added value lies in its explicit focus on the Yezoum variety, which, like many other 'satellite' varieties of Beti/Beti-Fang, typically remains invisible in reference grammars, dictionaries and educational materials, which often privilege standardised or prestigious varieties such as Ewondo and Bulu. The dataset captures micro-variation in phonetics, phonology, morphosyntax and lexical choice, which is essential for understanding socially situated linguistic practices rather than a homogeneous, abstract system. In this sense, the dataset contributes to a more inclusive representation of linguistic diversity.
From a methodological perspective, the dataset is designed to bridge the gap between language documentation and language technology. The parallel availability of text in the Yezoum variety and in French, alongside aligned speech, makes the dataset suitable for a wide range of applications, including automatic speech recognition (ASR), text-to-speech (TTS), machine translation (MT), forced alignment, pronunciation modelling and multimodal language learning tools. At the same time, the structured datasheet supports linguistic analysis, contrastive studies with other language varieties and pedagogical uses in teacher training and language revitalisation contexts. More broadly, the Yezoum_ALCAM-MultimodalDataset exemplifies an approach to African language resources that highlights fluidity, longitudinal variation, orality and community-based practice.
Specifics
Licensing
Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)
https://licensingafricandatasets.com/nwulite-obodo-license
Considerations
Restrictions/Special Constraints
By downloading this dataset, you agree:
- to use it for research and scientific purposes only;
- not to re-host or re-share this dataset.
Forbidden Usage
You agree not to use the data for: determining the identity of the speaker in the dataset; attempting to clone the speaker's voice or training models that imitate the speaker in this dataset; generative AI; reproduction; duplication; modification; augmentation; copying; distribution; transmission; display; sale; transfer; publication or creation of derivative works without the explicit permission of the legal owner of the dataset.
Processes
Intended Use
(a) Speech-related tasks:
- Automatic speech recognition (ASR): Audio–text alignment allows the evaluation of speech recognition models for Yezoum. However, it should be noted that the read sentences are transcribed phonetically. There are at least two competing orthographic standards for Ewondo, the designated language of which Yezoum is considered a dialect in standard classification. The General Alphabet of Cameroon's Languages is the one closest to phonetic transcription. The other is the Catholic Missionaries orthography, inspired by the model laid out by François Pichon in 1950.
- Text-to-speech (TTS): As the dataset contains clean sentence–audio pairs, it can also be used to evaluate speech synthesis or text-to-speech models. Here again, it should be noted that the sentences are written in the IPA and not in the General Alphabet of Cameroon's Languages, the Protestant alphabet or the Catholic alphabet.
- Speech–text alignment/forced alignment benchmarking: Fine-grained, word-level segmentation provides ideal ground truth for evaluating phoneme- or word-level aligners.
(b) Translation and multilingual tasks:
- Machine translation (Yezoum, considered a "dialect" of Ewondo, ↔ French): The sentence-level alignment between Yezoum/Ewondo and French makes the dataset a parallel corpus for evaluating translation models, within the limitations of the phonetic transcription standard employed.
- Speech translation (speech-to-text)
(c) Linguistic and lexicographic tasks:
- Morphological analysis/glossed corpus studies: The morpheme-level glosses are valuable for computational morphology, interlinear text modelling (ILTs) and grammar induction tasks.
- Lexicon and part-of-speech tagging: These are useful for building linguistic resources such as dictionaries, morphological analysers or taggers for Yezoum/Ewondo.
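For the ASR evaluation use case, one common metric is the character error rate (CER). The sketch below is illustrative only and is not part of the dataset tooling: the function names are hypothetical, and the choice to normalise both strings to Unicode NFD (so that each tone diacritic counts as its own symbol) is an assumption that may or may not suit a given evaluation protocol.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER over NFD code points (assumption: tone marks count as symbols)."""
    ref = unicodedata.normalize("NFD", reference)
    hyp = unicodedata.normalize("NFD", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this assumption, a hypothesis that gets every segment right but drops one tone mark is still penalised, which may be desirable given that the transcription marks contrastive tone.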
Metadata
Language
Yezoum (also written 'Yezum') is considered a variety of the Beti macro-linguistic group, which belongs to the Narrow Bantu family. Its speakers are located primarily in the Centre Region of Cameroon, in the Upper Sanaga Division (Haute-Sanaga), in several municipalities including Nanga-Eboko and Lembe-Yezoum.
Variants
At the time of publication of this dataset, the scope of variation within Yezoum has not been established precisely. Yezoum is a language or linguistic variety that is itself considered a component of a larger linguistic entity, either Ewondo or the wider Beti/Beti-Fang group.
Writing System
The writing system used for the transcription of Yezoum in this dataset is the International Phonetic Alphabet (IPA), as reflected in lexical entries (Word) and sentence-level examples (LangEx) in the datasheet. The phonological inventory described below is derived directly from the attested forms in the LangEx and Word columns of the datasheet.
1. Vowels
The vowel system attested in the dataset is as follows:
i, e, ɛ, a, ɔ, o, u, ə
These vowels occur both with and without tone marking in lexical items and running text (e.g. mə̀ndíp 'water', àɲù 'mouth', ǹló 'head', ngə́m 'tail').
2. Consonants
The consonant inventory reflected in the dataset includes the following simple, prenasalized and affricate consonants:
b, d, dz, dʒ, f, g, h, k, l, m, mb, mv, n, nd, ng, ŋ, ŋg, ŋk, p, r, s, t, ts, v, w, y, z, ɲ, ʒ
These consonants appear consistently across noun stems, verbal forms, derivational patterns, and noun-class alternations (e.g. ǹló 'head', ngə́m 'tail', dʒóé 'nose', àɲù 'mouth', mə̀ndíp 'water').
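The vowel and consonant lists above can be used as a quick sanity check on transcribed text. The following sketch is an assumption-laden illustration, not part of the dataset: it strips combining diacritics and reports any remaining character that does not occur in the listed inventories. It checks individual letters only and does not attempt to segment digraphs such as `mb` or `dʒ`; the name `unlisted_symbols` is hypothetical.

```python
import unicodedata

# Inventories copied from the lists above.
VOWELS = "i e ɛ a ɔ o u ə".split()
CONSONANTS = ("b d dz dʒ f g h k l m mb mv n nd ng ŋ ŋg ŋk "
              "p r s t ts v w y z ɲ ʒ").split()
LISTED_LETTERS = set("".join(VOWELS + CONSONANTS))


def unlisted_symbols(text):
    """Return the set of non-diacritic, non-space characters not in the lists."""
    decomposed = unicodedata.normalize("NFD", text)
    return {
        ch for ch in decomposed
        if not unicodedata.combining(ch)  # skip tone and other combining marks
        and not ch.isspace()
        and ch not in LISTED_LETTERS
    }
```

Characters flagged this way (for instance modifier letters or punctuation used in the transcription) are not necessarily errors; the check only shows where the text goes beyond the listed symbols.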
3. Tone system
The datasheet shows lexical and grammatical contrastive tones, marked directly on vowels and on the sonorants m and n. The following tonal categories are attested in the LangEx column:
High tone (H): á, é, ɛ́, í, ó, ɔ́, ú, ə́, ń, ḿ
Low tone (L): à, è, ɛ̀, ì, ò, ɔ̀, ù, ə̀, ǹ, m̀
Falling contour tone (HL): â, ê, î, ô, ɔ̂, û, ə̂, ɛ̂
Rising contour tone (LH): ǎ, ě, ǐ, ǒ, ɔ̌, ǔ, ə̌, ɛ̌
Falling-low contour tone: attested on consonant-final syllables in a small set of lexical items
Unmarked vowels represent tonally neutral or contextually determined syllables.
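Because tone is written with diacritics, the tone pattern of a form can be read off after Unicode decomposition. The sketch below assumes that the four marked categories listed above correspond to the combining acute, grave, circumflex and caron respectively; it ignores every other combining mark, does not model the falling-low contour, and yields nothing for unmarked syllables. The name `tone_sequence` is hypothetical.

```python
import unicodedata

# Combining marks for the four tone categories listed above.
TONE_MARKS = {
    "\u0301": "H",   # combining acute accent
    "\u0300": "L",   # combining grave accent
    "\u0302": "HL",  # combining circumflex accent
    "\u030c": "LH",  # combining caron
}


def tone_sequence(text):
    """Return the tone labels of all marked segments, in order of appearance."""
    decomposed = unicodedata.normalize("NFD", text)
    return [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]
```

Normalising to NFD first means that precomposed letters (such as `ó`) and letter-plus-combining-mark sequences (such as `ə̀`) are handled identically, including tone-bearing `m` and `n`.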
Source
The dataset was collected through a questionnaire designed to gather basic information about the Yezoum lexicon and grammar. This was done as part of the Atlas Linguistique du Cameroun (ALCAM) project.
Domain
The dataset derives from a linguistic questionnaire designed to elicit basic lexical and grammatical information.
Size
Total size is approximately 13.25 MB (uncompressed).
Structure
The dataset comprises: 1) a datasheet (ALCAM_dataset_Yezoum.tsv) with 399 lines and 20 columns; 2) 387 voice clips read by a single native speaker; 3) a sentence-to-audio mapping file (audio_mapping_MP3.tsv) with 387 lines and 3 columns.
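After download, the stated line and column counts can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `tsv_shape` is hypothetical, and whether the stated line counts include a header row is not specified here, so the check reports raw counts only.

```python
import csv


def tsv_shape(lines):
    """Return (number_of_rows, set_of_distinct_column_counts) for TSV input.

    `lines` can be an open text file or any iterable of strings.
    """
    # QUOTE_NONE: treat quote characters as ordinary text (an assumption
    # that seems reasonable for IPA transcriptions).
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = 0
    widths = set()
    for row in reader:
        rows += 1
        widths.add(len(row))
    return rows, widths


# Example usage (file names as given above, paths assumed to be local):
# with open("ALCAM_dataset_Yezoum.tsv", encoding="utf-8", newline="") as f:
#     print(tsv_shape(f))
# with open("audio_mapping_MP3.tsv", encoding="utf-8", newline="") as f:
#     print(tsv_shape(f))
```

A single value in the returned set indicates that every row has the same number of columns; more than one value points to ragged rows worth inspecting.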
Description of columns
#OrigID: original number of lexical entry on paper questionnaire
#EditID: modification of #OrigID
#FrenchRef: reference entry (originally provided in French)
#FrenchComm: original comments about reference entry (#FrenchRef)
#French: lexical entry in French (overlaps with #FrenchRef)
#Note: note of researcher on the lexical entry
#POS: part of speech
#Class: noun class (where applicable)
#Morf: morphological attribute (e.g. plural, singular)
#Var: (na)
#Word: lexical entry in Yezoum
#CrossRef: cross-referencing of lexical entry number
#FrenchEx: example sentence in French
#LangEx: example sentence in Yezoum
#LangExEdit: manual editing of #LangEx
#FrenchExEdit: edited French equivalent of #FrenchEx
#LangPars: word-for-word parsing in Yezoum
#LangParsEdit: editing of #LangPars
#FrenchPars: French equivalent of #LangParsEdit
#FrenchParsEdit: editing of #FrenchPars
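Several columns come in original/edited pairs (for example #LangEx and #LangExEdit). One possible way to consume them, sketched below, is to prefer the edited value when it is present and fall back to the original otherwise. This precedence is an assumption, not a documented rule of the dataset, and the exact header spelling in the TSV (with or without the leading `#`) should be checked against the file; `preferred` is a hypothetical helper.

```python
def preferred(row, base):
    """Return the edited value for `base` if non-empty, else the original.

    `row` is a mapping from column name to cell text (e.g. from
    csv.DictReader); the edited column is assumed to be named base + "Edit".
    """
    edited = (row.get(base + "Edit") or "").strip()
    if edited:
        return edited
    return (row.get(base) or "").strip()


# Example with a hand-made row (values are placeholders, not dataset content):
example_row = {"#LangEx": "original", "#LangExEdit": ""}
sentence = preferred(example_row, "#LangEx")
```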
Sample
| audio files | words & sentences |
|---|---|
| e7d9294a350c225fa2da5733960530ed.mp3 | àɲù; à bèlé mǎn áɲù |
| 89f430261e472d80c9f1e5f8ee55fe55.mp3 | mə̀ɲù; á ně: àbʷǎt |
| ebb100d39c8c2e0d41d8443f2df487ce.mp3 | dís; mà kɔ̀m mís |
| 9b2d98deb63ca6b5daaeaa31765a5a94.mp3 | ǹlō; à bèlé ǹló ónɛ̄ à nɛ̂n è kʸiŋ è tòá áyàp |
| a7f16551395d088a23a41e8d5d8cc8db.mp3 | mə̀vùl; mə̀vùl má vîn |
| fe51ef3e2c670fae8e92dd46d20f6ece.mp3 | àsòŋ; mə́ nə̀ mə̀sòŋ mə́ mvú |
| 1c40db0bcf65f8dae04ec78ab2f60597.mp3 | mə̀sòŋ |
| 5bb56f5fc4958a442236e5a12a0b9b4e.mp3 | òyém; à díkì òyêm |
| 0ac63f1ddeb51efe291ba56cd17219b4.mp3 | ǹdzóé; à yà ndzóé |
| ef0ac99a6c387ab4ae93cdcd0085e6bf.mp3 | kíŋ; kíŋ dzìè énə̀ à nɛ́n |
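In the sample above, each text cell holds a word, optionally followed by a semicolon and an example sentence, and each clip name consists of 32 hexadecimal characters plus `.mp3`. The sketch below parses cells of that shape. It reflects only the ten sample rows shown here; the actual layout of the three-column mapping file may differ, and both function names are hypothetical.

```python
import re

# Pattern observed in the sample rows: 32 lowercase hex characters + ".mp3".
CLIP_NAME = re.compile(r"[0-9a-f]{32}\.mp3")


def looks_like_clip_name(name):
    """True if `name` matches the file-name pattern seen in the sample."""
    return CLIP_NAME.fullmatch(name) is not None


def split_entry(cell):
    """Split 'word; sentence' into (word, sentence); sentence is None if absent."""
    word, separator, sentence = cell.partition(";")
    return word.strip(), (sentence.strip() if separator else None)
```

Rows such as the one containing only `mə̀sòŋ` come back with `None` as the sentence, so downstream code can tell word-only clips from word-plus-sentence clips.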