Ewondo_Fong_ALCAM-MultimodalDataset

License:

NOODL-1.0

Steward:

Institute of African Digital Humanities

Task: NLP

Release Date: 1/20/2026

Format: MP3, TSV

Size: 16.80 MB

Description

Ewondo_Fong_ALCAM-MultimodalDataset is a richly curated, multimodal linguistic dataset dedicated to the documentation and technological enhancement of the Fong variety of the widely designated 'Ewondo language'. Fong is a localised and socially embedded speech form that is rarely represented in standard grammatical descriptions or lexicographical resources. The dataset comprises three closely aligned components: (i) a structured datasheet containing carefully selected example sentences reflecting casual, albeit non-authentic, usage in the Fong variety; (ii) high-quality audio recordings of these sentences, produced by a native speaker; and (iii) an explicit audio–sentence mapping file enabling precise alignment between the textual and acoustic data.
The dataset's primary added value lies in its explicit focus on the Fong variety of Ewondo, which typically remains invisible in reference grammars, dictionaries and educational materials that often privilege standardised or prestigious varieties. The dataset captures micro-variation in phonetics, phonology, morphosyntax and lexical choice, which are essential for understanding socially situated linguistic practices rather than a homogeneous, abstract system. In this sense, the dataset contributes to a more inclusive representation of linguistic diversity.
From a methodological perspective, the dataset is designed to bridge the gap between language documentation and language technology. The parallel availability of text in the Fong variety and in French, alongside aligned speech, makes the dataset suitable for a wide range of applications, including automatic speech recognition (ASR), text-to-speech (TTS), machine translation (MT), forced alignment, pronunciation modelling and multimodal language learning tools. At the same time, the structured datasheet supports linguistic analysis, contrastive studies with other language varieties and pedagogical uses in teacher training and language revitalisation contexts.
More broadly, the Ewondo_Fong_ALCAM-MultimodalDataset exemplifies an approach to African language resources that highlights fluidity, longitudinal variation, orality and community-based practice.

Specifics

Licensing

Nwulite Obodo Open Data Licence 1.0 (NOODL-1.0)

https://licensingafricandatasets.com/nwulite-obodo-license

Considerations

Restrictions/Special Constraints

By downloading this dataset, you agree:
- to use it for research, scientific and pedagogical purposes only;
- that you will not re-host or re-share this dataset.

Forbidden Usage

You agree not to use the data for: determining the identity of the speaker in the dataset; attempting to clone the voice or training models that imitate the speaker in this dataset; Generative AI; reproduction; duplication; modification; augmentation; copying; distribution; transmission; display; sale; transfer; publication or creation of derivative works without the explicit permission of the legal owner of the dataset.

Processes

Intended Use

(a) Speech-related tasks:
- Automatic speech recognition (ASR): Audio–text alignment allows the evaluation of speech recognition models for Ewondo. However, it should be noted that the read sentences are transcribed phonetically. There are at least two competing orthographic standards for Ewondo: the General Alphabet of Cameroon's Languages, which is the closest to phonetic transcription, and the Catholic Missionaries orthography, inspired by the model laid out by François Pichon in 1950.
- Text-to-speech (TTS): As the dataset contains clean sentence–audio pairs, it can also be used to evaluate speech synthesis or text-to-speech models. Here again, it should be noted that the alphabet used to write the sentences is the IPA and not the General Alphabet of Cameroon's Languages, the Protestant alphabet or the Catholic alphabet.
- Speech–text alignment/forced alignment benchmarking: Fine-grained, word-level segmentation provides ideal ground truth for evaluating phoneme- or word-level aligners.
(b) Translation and multilingual tasks:
- Machine translation (Ewondo ↔ French): The sentence-level alignment between Ewondo and French makes it a parallel corpus for evaluating translation models, within the limitations of the phonetic transcription employed.
- Speech translation (speech-to-text):
(c) Linguistic and lexicographic tasks:
- Morphological analysis/glossed corpus studies: The morpheme-level glosses are valuable for computational morphology, interlinear text modelling (ILTs) and grammar induction tasks.
- Lexicon and part-of-speech tagging: These are useful for building linguistic resources such as dictionaries, morphological analysers or taggers for Ewondo.
It is hoped that this dataset will serve pedagogical uses in teacher training, language learning and language revitalisation contexts.
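For the ASR evaluation use mentioned above, one common metric is the character error rate (CER). The following is a minimal sketch, not a procedure prescribed by the dataset: it assumes that reference and hypothesis are plain IPA strings, and it normalises both to Unicode NFD so that precomposed and decomposed tone-marked characters compare equal. The function names `edit_distance` and `cer` are illustrative.

```python
import unicodedata

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of characters."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def cer(reference, hypothesis):
    """Character error rate after NFD normalisation (assumed convention)."""
    ref = unicodedata.normalize("NFD", reference)
    hyp = unicodedata.normalize("NFD", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Because NFD separates tone diacritics from their base letters, a wrong tone on an otherwise correct vowel counts as a single character error under this convention. Other conventions, such as stripping tones before scoring, are equally possible.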

Metadata

Language

Fong is considered a dialect of Ewondo, a Narrow Bantu language indigenous to a population mainly located in the Centre Region of Cameroon, with pockets of settlement in the South and East Regions. Ewondo serves as a vehicular language for populations in the South and East Regions of Cameroon, and has also developed into a creole known as Mongo Ewondo.

Variants

The Fong are a small group scattered across several settlements in the Centre and South regions of present-day Cameroon. In the southern region, they are located in and around Mvoutessi in the Zoetele sub-division of the Dja-and-Lobo division, and in the villages of Ngoazip 1 and Ngoazip 2 in the Biwong-Bane sub-division of the Mvilla division. Social media posts also suggest that the Fong population may be found in the Mefou-and-Afamba Division in the Centre Region, around an area known as Nkoabang, as well as in Ngoulemekong in the Nyong-and-Mfoumou Division in the Centre Region of Cameroon.

Writing System

The writing system used for the transcription of the Fong variety in this dataset is the International Phonetic Alphabet (IPA), as reflected in lexical entries (Word) and sentence-level examples (LangEx) in the datasheet.

1. Vowels

The vowel system attested in the dataset is as follows:

i, e, ɛ, a, ɔ, o, u, ə

These vowels occur both with and without tone marking in lexical items and running text (e.g. mə̀ndím ‘water’, àlú ‘night’, èbɔ̀g ‘hip’, ngə́m ‘tail’).

2. Consonants

The consonant inventory reflected in the dataset includes the following simple and prenasalized consonants:

b, d, dz, f, g, ɣ, h, k, l, m, mb, mf, mv, n, nd, ndz, ɲ, ŋ, ŋg, ŋk, p, r, s, t, ts, v, w, y, z

These consonants appear consistently across noun stems, verbal forms, derivational patterns, and noun-class alternations (e.g. ǹló ‘head’, ŋgə́m ‘tail’, dzóé ‘name’, bə̀mvú ‘people’).

3. Tone system

The datasheet shows lexical and grammatical contrastive tones, marked directly on vowels and sonorants:

High tone (H): á, é, í, ó, ú, ɛ́, ɔ́, ń
Low tone (L): à, è, ì, ò, ù, ɛ̀, ɔ̀, ǹ
Falling contour tone (HL): â, ê, î, ô, û, ɛ̂, ɔ̂
Rising contour tone (LH): ǎ, ě, ǒ, ǔ, ɛ̌, ɔ̌
Mid / level tone: ā, ē, ī, ō, ū, ɛ̄, ɔ̄
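
The tone diacritics listed above can be read off programmatically once the text is decomposed into base letters and combining marks. The sketch below assumes that the transcriptions use the standard Unicode combining accents (acute, grave, circumflex, caron, macron), either precomposed or combining. The helper names `tone_sequence` and `strip_tones` are illustrative and not part of the dataset.

```python
import unicodedata

# Combining diacritics mapped to the tone labels used in the list above.
TONE_MARKS = {
    "\u0301": "H",   # acute accent: high
    "\u0300": "L",   # grave accent: low
    "\u0302": "HL",  # circumflex: falling
    "\u030C": "LH",  # caron: rising
    "\u0304": "M",   # macron: mid / level
}

def tone_sequence(text):
    """Return the tone labels found in `text`, in order of appearance."""
    decomposed = unicodedata.normalize("NFD", text)
    return [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]

def strip_tones(text):
    """Remove tone diacritics while keeping the segmental transcription."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)
```

Applied to mə̀ndím ‘water’, `tone_sequence` yields a low tone followed by a high tone, and `strip_tones` yields the toneless form məndim.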

Source

The dataset was collected through a questionnaire designed to gather basic information about the Ewondo lexicon and grammar, as part of the Atlas Linguistique du Cameroun (ALCAM) project. It contains linguistic data collected in an area inhabited by a group that identifies as Fong, located southeast of Yaoundé and primarily spanning the Nyong-et-Mfoumou and Nyong-et-So'o divisions (around Akonolinga/Endom and Mbalmayo/Dzeng).

Domain

The dataset represents a linguistic questionnaire designed to elicit basic lexical and grammatical information.

Size

Total size is 16.80 MB

Structure

The dataset comprises: 1) a datasheet with 494 lines and 19 columns; 2) 438 voice clips read by a single female native speaker; 3) sentence-to-audio mapping with 438 lines and three columns.
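
As a quick sanity check of the structure described above, the tab-separated files can be read with Python's standard `csv` module and their shape compared with the stated figures. This is a minimal sketch: the file names in the comments are hypothetical, and whether the stated line counts include a header row is not specified here.

```python
import csv
import io

def read_tsv(handle):
    """Read a tab-separated file into a list of rows (lists of strings)."""
    # QUOTE_NONE keeps any quotation marks in the transcriptions literal.
    return list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))

def rows_with_wrong_width(rows, expected_columns):
    """Return the indices of rows whose column count differs from expected."""
    return [i for i, row in enumerate(rows) if len(row) != expected_columns]

# Demonstration on a small in-memory sample (placeholder values only).
sample = "a\tb\tc\nd\te\n"
demo_rows = read_tsv(io.StringIO(sample))
bad_rows = rows_with_wrong_width(demo_rows, 3)

# With the real files (names are assumptions, adjust to the download):
# with open("datasheet.tsv", encoding="utf-8", newline="") as f:
#     rows = read_tsv(f)
# print(len(rows), rows_with_wrong_width(rows, 19))
```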

Description of columns

#OrigID: original number of lexical entry on paper questionnaire
#EditID: modification of #OrigID
#FrenchRef: reference entry (originally provided in French)
#FrenchComm: Original comments about reference entry (#FrenchRef)
#French: Lexical entry in French (overlaps with #FrenchRef)
#Note: note of researcher on the lexical entry
#POS: part of speech
#Class: noun class (where applicable)
#Morph: morphological attribute (e.g. plural, singular)
#Var: (na)
#Word: Lexical entry in Ewondo, Fong variety
#CrossRef: Cross-referencing of lexical entry number
#FrenchEx: Example sentence in French
#LangEx: Example sentence in Ewondo
#LangExEdit: manual editing of #LangEx
#LangPars: word for word parsing in Ewondo
#LangParsEdit: editing of #LangPars
#FrenchPars: French equivalent of #LangParsEdit
#FrenchParsEdit: editing of #FrenchPars
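
Several columns come in pairs of an original field and an edited field (for example #LangEx and #LangExEdit). The sketch below shows one way to pick a single value per pair. It assumes that the edited column holds the corrected form, that an empty edited cell means no correction was made, and that the header row uses the column names exactly as listed above. None of this is confirmed by the datasheet itself, and the helper name `preferred` is illustrative.

```python
import csv
import io

def preferred(row, base, edited):
    """Return the edited value if it is non-empty, otherwise the base value."""
    edited_value = (row.get(edited) or "").strip()
    if edited_value:
        return edited_value
    return (row.get(base) or "").strip()

# Placeholder rows standing in for the datasheet (not real Fong data).
sample = (
    "#LangEx\t#LangExEdit\n"
    "original 1\t\n"
    "original 2\tedited 2\n"
)
reader = csv.DictReader(io.StringIO(sample), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
sentences = [preferred(row, "#LangEx", "#LangExEdit") for row in reader]
```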

Sample

audio files	words & sentences
3f7e60f716d0f74f789f41da8f2bd4ae.mp3	àbɔ̀; à bə̀lə́ bɔ̌tə̄ mə̀bɔ̀
96a514759c952393b618cb71bb7a3413.mp3	mə̀bɔ̀; à bə̀lə́ bɔ̀tə̄ mə̀bɔ̀
4f742540588e53638205174ce2385f57.mp3	ǹsɔ́t; à sòp ǹsɔ́t
7978916fc07b929550163580526928be.mp3	mìnsɔ́t;
e213925b9dcaf98bfd911cf4d504cedc.mp3	èbón; à sòp èbón
5d8ed0e696ac7917e225d068fe7627da.mp3	bìbón;
4af6597d4ae1bc9f8e054bd81c10f0d5.mp3	àkàn; à sòp mə̀kàn mə́mɔ́n wě
529042c9bcf4f4e3181309aa11f82e7a.mp3	mə̀kàn; à sòp mə̀kàn mə́mɔ́n wē
5bb894265b2c899abe44805c7fbc4072.mp3	àbum; ò bə̀gə́ ábùm ńdzálán mə́ndím
84875e4113ce25d9df93047348bd9814.mp3	àbum; ò bə̀ɣə́ ábùm ńdzálán mə́ndím
eae11274eca86ebb8872ef3260a12e77.mp3	mə̀bùm;
736ad51dc06bf749e0e3a7e0a5a64046.mp3	dɔ́p; mə̀ tâ dɔ́p mīníngá
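
The sample rows above can be split into an audio file name, a headword and an optional example sentence. The following sketch assumes the layout shown in the sample, namely a tab between the file name and the text, and a semicolon between the word and the sentence. The actual mapping file is described as having three columns, so it may store these fields differently. The function name `parse_mapping_line` is illustrative.

```python
def parse_mapping_line(line):
    """Split one sample line into (audio_file, word, sentence_or_None)."""
    audio_file, _, text = line.rstrip("\n").partition("\t")
    word, _, sentence = text.partition(";")
    # Entries such as plural forms may have no example sentence.
    return audio_file.strip(), word.strip(), sentence.strip() or None

# Example with two rows copied from the sample above.
with_sentence = parse_mapping_line(
    "4f742540588e53638205174ce2385f57.mp3\tǹsɔ́t; à sòp ǹsɔ́t"
)
without_sentence = parse_mapping_line(
    "7978916fc07b929550163580526928be.mp3\tmìnsɔ́t;"
)
```

Rows without a sentence return `None` in the third position, which makes it easy to filter for word-only recordings.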