Yezoum_ALCAM-MultimodalDataset

License:

NOODL-1.0

Steward:

Institute of African Digital Humanities

Task: NLP

Release Date: 3/30/2026

Format: MP3, TSV

Size: 12.81 MB


Description

Yezoum_ALCAM-MultimodalDataset is a curated, multimodal linguistic dataset dedicated to the documentation and technological enhancement of Yezoum, a variety of what is widely designated the 'Ewondo language' or, alternatively, of the macro-linguistic group known as Beti or Beti-Fang. Yezoum is a localised and socially embedded speech form that is rarely represented in standard grammatical descriptions or lexicographical resources. The dataset comprises three closely aligned components: (i) a structured datasheet containing carefully selected example sentences reflecting casual, albeit non-authentic, usage in the Yezoum variety; (ii) high-quality audio recordings of these sentences, produced by a native speaker; and (iii) an explicit audio–sentence mapping file enabling precise alignment between the textual and acoustic data.

The dataset's primary added value lies in its explicit focus on the Yezoum variety, which, like many other 'satellite' varieties of Beti/Beti-Fang, typically remains invisible in reference grammars, dictionaries and educational materials that often privilege standardised or prestigious varieties such as Ewondo and Bulu. The dataset captures micro-variation in phonetics, phonology, morphosyntax and lexical choice, all of which are essential for understanding socially situated linguistic practices rather than a homogeneous, abstract system. In this sense, the dataset contributes to a more inclusive representation of linguistic diversity.

From a methodological perspective, the dataset is designed to bridge the gap between language documentation and language technology. The parallel availability of text in the Yezoum variety and in French, alongside aligned speech, makes the dataset suitable for a wide range of applications, including automatic speech recognition (ASR), text-to-speech (TTS), machine translation (MT), forced alignment, pronunciation modelling and multimodal language learning tools. At the same time, the structured datasheet supports linguistic analysis, contrastive studies with other language varieties and pedagogical uses in teacher training and language revitalisation contexts.

More broadly, the Yezoum_ALCAM-MultimodalDataset exemplifies an approach to African language resources that highlights fluidity, longitudinal variation, orality and community-based practice.

Specifics

Licensing

Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)

https://licensingafricandatasets.com/nwulite-obodo-license

Considerations

Restrictions/Special Constraints

By downloading this dataset, you agree:

  • to use it for research and scientific purposes only;

  • not to re-host or re-share this dataset.

Forbidden Usage

You agree not to use the data for: determining the identity of the speaker in the dataset; attempting to clone the voice of the speaker or to train models that imitate the speaker in this dataset; generative AI; reproduction; duplication; modification; augmentation; copying; distribution; transmission; display; sale; transfer; publication; or creation of derivative works without the explicit permission of the legal owner of the dataset.

Processes

Intended Use

(a) Speech-related tasks:

  • Automatic speech recognition (ASR): Audio–text alignment allows the evaluation of speech recognition models for Yezoum. However, it should be noted that the read sentences are transcribed phonetically. Ewondo, the designated language of which Yezoum is considered a dialect in standard classification, has competing orthographic standards, including the General Alphabet of Cameroon's Languages, which is the closest to phonetic transcription, and the Catholic missionaries' orthography, inspired by the model laid out by François Pichon in 1950.

  • Text-to-speech (TTS): As the dataset contains clean sentence–audio pairs, it can also be used to evaluate speech synthesis or text-to-speech models. Here again, it should be noted that the sentences are written in the IPA and not in the General Alphabet of Cameroon's Languages, the Protestant alphabet or the Catholic alphabet.

  • Speech–text alignment/forced alignment benchmarking: Fine-grained, word-level segmentation provides ideal ground truth for evaluating phoneme- or word-level aligners.

(b) Translation and multilingual tasks:

  • Machine translation (Yezoum (considered as a "dialect" of Ewondo) ↔ French): The sentence-level alignment between Yezoum/Ewondo and French makes the dataset a parallel corpus for evaluating translation models, within the limitations of the phonetic transcription employed.

  • Speech translation (speech-to-text)

(c) Linguistic and lexicographic tasks:

  • Morphological analysis/glossed corpus studies: The morpheme-level glosses are valuable for computational morphology, interlinear text modelling (ILTs) and grammar induction tasks.

  • Lexicon and part-of-speech tagging: These are useful for building linguistic resources such as dictionaries, morphological analysers or taggers for Yezoum/Ewondo.
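Because the sentences are transcribed in IPA with tone diacritics, a plain character-level comparison of an ASR hypothesis against the reference could count a vowel and its tone mark as two separate characters, or treat precomposed and decomposed forms of the same letter as different. The sketch below illustrates one possible way to compute an error rate over "base letter plus diacritics" units. It is not part of the dataset; the function names are hypothetical, and grouping by Unicode combining marks is an assumption about what a useful scoring unit would be.

```python
import unicodedata


def segments(text):
    """Split text into units made of a base character plus its combining marks."""
    units = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and units:
            units[-1] += ch  # attach the diacritic to the preceding base character
        else:
            units.append(ch)
    return units


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        previous = current
    return previous[-1]


def error_rate(reference, hypothesis):
    """Edit distance over diacritic-aware units, divided by the reference length."""
    ref, hyp = segments(reference), segments(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

With this unit definition, a hypothesis that gets the vowel right but drops its tone mark counts as one substitution rather than one deletion, and spaces are scored like any other unit.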

Metadata

Language

Yezoum (also written 'Yezum') is considered a variety of the Beti macro-linguistic group, which belongs to the Narrow Bantu family. Its speakers are located primarily in the Centre Region of Cameroon, in the Upper-Nyong Division (Haut-Nyong) and in several municipalities, including Nanga-Eboko and Lembe-Yezoum.

Variants

At the time of publication of this dataset, the scope of variation within Yezoum is not precisely known. Yezoum is a language or linguistic variety that is itself considered a component of a larger linguistic entity, either Ewondo or the broader Beti/Beti-Fang group.

Writing System

The writing system used for the transcription of Yezoum in this dataset is the International Phonetic Alphabet (IPA), as reflected in lexical entries (Word) and sentence-level examples (LangEx) in the datasheet. The phonological inventory described below is derived directly from the attested forms in the LangEx and Word columns of the datasheet.

1. Vowels

The vowel system attested in the dataset is as follows:

i, e, ɛ, a, ɔ, o, u, ə

These vowels occur both with and without tone marking in lexical items and running text (e.g. mə̀ndíp 'water', àɲù 'mouth', ǹló 'head', ngə́m 'tail').

2. Consonants

The consonant inventory reflected in the dataset includes the following simple, prenasalized and affricate consonants:

b, d, dz, dʒ, f, g, h, k, l, m, mb, mv, n, nd, ng, ŋ, ŋg, ŋk, p, r, s, t, ts, v, w, y, z, ɲ, ʒ

These consonants appear consistently across noun stems, verbal forms, derivational patterns, and noun-class alternations (e.g. ǹló 'head', ngə́m 'tail', dʒóé 'nose', àɲù 'mouth', mə̀ndíp 'water').

3. Tone system

The datasheet shows lexically and grammatically contrastive tones, marked directly on vowels and on the sonorants m and n. The following tonal categories are attested in the LangEx column:

  • High tone (H): á, é, ɛ́, í, ó, ɔ́, ú, ə́, ń, ḿ

  • Low tone (L): à, è, ɛ̀, ì, ò, ɔ̀, ù, ə̀, ǹ, m̀

  • Falling contour tone (HL): â, ê, î, ô, ɔ̂, û, ə̂, ɛ̂

  • Rising contour tone (LH): ǎ, ě, ǐ, ǒ, ɔ̌, ǔ, ə̌, ɛ̌

  • Falling-low contour tone: attested on consonant-final syllables in a small set of lexical items

Unmarked vowels represent tonally neutral or contextually determined syllables.
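The four diacritic-marked categories listed above correspond to standard Unicode combining marks (acute, grave, circumflex, caron), so a tone melody can be read off a transcription mechanically. The following is a minimal sketch under that assumption; the `TONE_MARKS` table and the function name are illustrative and not part of the dataset. Marks outside the table and unmarked vowels are simply skipped.

```python
import unicodedata

# Combining marks for the tone categories listed above, as they appear
# after canonical decomposition (NFD). The mapping is an assumption based
# on the inventory in this section.
TONE_MARKS = {
    "\u0301": "H",   # combining acute accent
    "\u0300": "L",   # combining grave accent
    "\u0302": "HL",  # combining circumflex accent (falling)
    "\u030C": "LH",  # combining caron (rising)
}


def tone_sequence(text):
    """Return the tone labels marked in text, in order of appearance."""
    decomposed = unicodedata.normalize("NFD", text)
    return [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]
```

Decomposing first means that precomposed letters such as `ǹ` or `í` and letter-plus-combining-mark sequences such as `ə̀` are handled the same way.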

Source

The dataset was collected through a questionnaire designed to gather basic information about the Yezoum lexicon and grammar. This was done as part of the Atlas Linguistique du Cameroun (ALCAM) project.

Domain

The dataset represents a linguistic questionnaire designed to elicit the basic lexicon and grammatical information.

Size

Total size is approximately 13.25 MB (uncompressed).

Structure

The dataset comprises: 1) a datasheet (ALCAM_dataset_Yezoum.tsv) with 399 lines and 20 columns; 2) 387 voice clips read by a single native speaker; 3) a sentence-to-audio mapping file (audio_mapping_MP3.tsv) with 387 lines and 3 columns.
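As a rough sanity check after download, the two TSV files and the clips can be counted and compared with the figures above. This is a sketch only: the file names come from the description, but the directory layout (both TSV files at the top level, clips anywhere below it) and the presence of a header row in each file are assumptions, so the counts may differ by one from the stated line counts.

```python
import csv
from pathlib import Path

# File names as given in the Structure description.
DATASHEET = "ALCAM_dataset_Yezoum.tsv"
MAPPING = "audio_mapping_MP3.tsv"


def read_tsv(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps any quotation marks in the transcriptions literal.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def summarise(dataset_dir):
    """Count data rows in both TSV files and MP3 clips under dataset_dir."""
    root = Path(dataset_dir)
    return {
        "datasheet_rows": len(read_tsv(root / DATASHEET)),
        "mapping_rows": len(read_tsv(root / MAPPING)),
        "audio_clips": sum(1 for _ in root.rglob("*.mp3")),
    }
```

Calling `summarise("path/to/dataset")` (a placeholder path) returns the three counts as a dictionary.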

Description of columns
  • #OrigID: original number of lexical entry on paper questionnaire

  • #EditID: modification of #OrigID

  • #FrenchRef: reference entry (originally provided in French)

  • #FrenchComm: original comments about reference entry (#FrenchRef)

  • #French: lexical entry in French (overlaps with #FrenchRef)

  • #Note: note of researcher on the lexical entry

  • #POS: part of speech

  • #Class: noun class (where applicable)

  • #Morf: morphological attribute (e.g. plural, singular)

  • #Var: (na)

  • #Word: lexical entry in Yezoum

  • #CrossRef: cross-referencing of lexical entry number

  • #FrenchEx: example sentence in French

  • #LangEx: example sentence in Yezoum

  • #LangExEdit: manual editing of #LangEx

  • #FrenchExEdit: edited French equivalent of #FrenchEx

  • #LangPars: word-for-word parsing in Yezoum

  • #LangParsEdit: editing of #LangPars

  • #FrenchPars: French equivalent of #LangParsEdit

  • #FrenchParsEdit: editing of #FrenchPars

Sample

audio files                           words & sentences
e7d9294a350c225fa2da5733960530ed.mp3  àɲù; à bèlé mǎn áɲù
89f430261e472d80c9f1e5f8ee55fe55.mp3  mə̀ɲù; á ně: àbʷǎt
ebb100d39c8c2e0d41d8443f2df487ce.mp3  dís; mà kɔ̀m mís
9b2d98deb63ca6b5daaeaa31765a5a94.mp3  ǹlō; à bèlé ǹló ónɛ̄ à nɛ̂n è kʸiŋ è tòá áyàp
a7f16551395d088a23a41e8d5d8cc8db.mp3  mə̀vùl; mə̀vùl má vîn
fe51ef3e2c670fae8e92dd46d20f6ece.mp3  àsòŋ; mə́ nə̀ mə̀sòŋ mə́ mvú
1c40db0bcf65f8dae04ec78ab2f60597.mp3  mə̀sòŋ
5bb56f5fc4958a442236e5a12a0b9b4e.mp3  òyém; à díkì òyêm
0ac63f1ddeb51efe291ba56cd17219b4.mp3  ǹdzóé; à yà ndzóé
ef0ac99a6c387ab4ae93cdcd0085e6bf.mp3  kíŋ; kíŋ dzìè énə̀ à nɛ́n
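
In the sample above, each text field holds a lexical entry, optionally followed by a semicolon and an example sentence; one entry has no sentence at all. Assuming the full mapping file follows the same convention (which the sample suggests but does not guarantee), a field could be split as sketched below. The function name is hypothetical.

```python
def split_entry(field):
    """Split 'word; sentence' into (word, sentence); sentence is None if absent."""
    # partition splits on the first semicolon only, so any later
    # punctuation inside the sentence is left untouched.
    word, separator, sentence = field.partition(";")
    word = word.strip()
    sentence = sentence.strip()
    return word, (sentence if separator and sentence else None)
```

Returning `None` for a missing sentence keeps word-only clips distinguishable from clips that pair a word with an example.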