TTS Muna Dataset

License: CC-BY-NC-SA-4.0

Steward: Community

Task: TTS

Release Date: 1/30/2026

Format: WEBM & TSV

Size: 316.34 MB


Description

The Muna language, locally known as Wuna, is an Austronesian language spoken by approximately 300,000 people in Southeast Sulawesi, Indonesia, primarily on Muna Island, in large parts of West Muna, and along the western coasts of Buton and Central Buton. According to Ethnologue and the Kemendikbud Language Map, the language is widely distributed but faces sustainability challenges: it currently holds Threatened status because it is less and less often passed down to the youngest generation. Muna has a rich diversity of dialects that reflect the historical migration of its speakers, including Standard Muna in central areas such as Raha, the Gu-Mawasangka dialect in Central Buton, and distinct dialects on smaller islands such as Kadatua and Siompu. The Tiworo dialect is spoken in the northwest, in the Tiworo Archipelago and West Muna, and has lexical characteristics that set it apart from the central dialect, while the Lohia dialect is rooted in Lohia District and has distinctive phonological traits and a sharp accent. The Tiworo and Lohia dialects show that, although Muna appears to be a single entity, it has strong local variation that is officially recognized in both Ethnologue and Kemendikbud documentation. Together these dialects form a cohesive Muna language ecosystem that is grammatically complex, particularly in its verb conjugation system based on subject and temporal aspect, which makes it one of the most distinctive indigenous languages in the Celebic region.

Specifics

Licensing

Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)

https://spdx.org/licenses/CC-BY-NC-SA-4.0.html

Considerations

Restrictions/Special Constraints

This dataset is intended for research, education, and cultural preservation purposes.

Forbidden Usage

This dataset may not be used for commercial purposes, modified in format, or reproduced in any other form.

Processes

Ethical Review

The dataset creator obtained permission from the Southeast Sulawesi Provincial Language Council (Balai Bahasa Provinsi Sulawesi Tenggara), under the Ministry of Primary and Secondary Education of Indonesia, to use Pabitara Magazine and the children's stories published between 2018 and 2025. The magazines and children's stories were originally in PDF format and were converted into TXT files. Native Muna speakers then read the TXT files aloud and recorded them through the hosting platform https://sabre-2.onrender.com/. The resulting audio recordings were compiled into this dataset.

Intended Use

These magazines and children's storybooks are intended for research, education, and cultural archiving and preservation. The same objectives apply to this TTS dataset.

Metadata

Language:

This dataset contains the Muna language, specifically the Muna, Tiworo, and Lohia dialects, read by native Muna speakers in their 20s and 30s. The texts converted into speech contain code-mixing between Muna and Indonesian.

Source(s):

The dataset is derived from 11 issues of Pabitara Magazine and 8 Children’s Storybooks published by Balai Bahasa Provinsi Sulawesi Tenggara between 2018 and 2025. These materials were issued under the Ministry of Primary and Secondary Education of Indonesia.

The article section consists of 11 manuscripts.

1. "Tari Mangaru dalam Ritual Matogalampa di Rongi" which tells the story of the traditional ritual of bravery in South Buton.

2. "Islam sebagai Keyakinan Orang Laut (Suku Bajo)" which discusses the religious systems and beliefs of maritime communities.

3. "Mengenal Potensi Wisata Budaya Tolaki di Kota Kendari" which explores the cultural heritage of the Tolaki tribe in the provincial capital.

4. "Merawat Budaya, Menelusuri Bahasa" which focuses on efforts to preserve endangered local languages in Baubau City.

5. "Festival Musikalisasi Puisi 2019" which illustrates the harmony between literature and music in a creative event.

6. "Majelis Para Penyair" which provides a critical review of a film regarding the world of poetry.

7. "Kekayaan Alam dan Kaum Migran" which details the history of regional naming in Southeast Sulawesi.

8. "Kantola, Tradisi Lisan Masyarakat Muna" which documents the authentic oral literature of the Muna people.

9. "Dari Berburu hingga Bertani" which recounts a fragment of the history of livelihoods in Bombana Regency.

10. "Cerak Leppa" which describes the ritual of launching a new boat in the Bajo community.

11. "Pengobatan Tradisional Kagombe-Gombe" which explains a spiritual healing ritual for shingles based on the Sufi concept of the "Seven Dignities" (Martabat Tujuh).

Similarly, the children's stories consist of 8 texts.

1. "Berapa Tinggi Gunung Itu?" which tells of Eli the rabbit who learns that the journey of adventure is more valuable than the final result.

2. "Buku Gambar Tama" which shares Tama's pride in introducing the Tokotua Traditional House of the Moronene Tribe through his drawings.

3. "Mimpi Ayu Menari" which depicts the persistent spirit of Wa Ayu in pursuing her dream of dancing despite limited musical instruments.

4. "Anak Laut" which follows Farhan's struggle to protect coral reefs from plastic waste on Kabaena Island.

5. "Kemenangan Bersama" which highlights the cooperation of a group of friends in finding coconut shells to participate in the traditional Muna game, Kalego.

6. "Menyelamatkan Burung Bungkoloko" which teaches about Ode's responsibility to repair a bird's nest damaged by his own actions.

7. "Bungkus Jajanan Ima" which recounts a nightmare that makes Ima realize the negative impact of littering in the river.

8. "Atraksi Sori" which focuses on character building and local cultural values through the unique experiences of a character named Sori.

Domain(s):

The themes encompass cultural heritage and local children’s stories from Southeast Sulawesi, Indonesia.

Size:

11 Magazines, 8 Children's Storybooks, 5.5 hours of TTS.

Structure:

Audio file name, text
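As an illustration of the two-field structure above, the following is a minimal Python sketch for reading such a TSV manifest. The card does not state the exact column headers, whether a header row exists, or how the WEBM files are named, so the `has_header` flag, the `Utterance` type, and the file name `clip_0001.webm` are all hypothetical; the only assumption taken from the card is that each row holds an audio file name followed by its text.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Utterance(NamedTuple):
    audio_file: str  # name of the audio file (WEBM in this dataset)
    text: str        # transcript that was read aloud


def read_manifest(lines: Iterable[str], has_header: bool = True) -> List[Utterance]:
    """Parse tab-separated rows of (audio file name, text).

    has_header is an assumption: set it to False if the TSV has no header row.
    """
    # QUOTE_NONE keeps quotes and apostrophes in the transcripts untouched.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows: List[Utterance] = []
    for index, row in enumerate(reader):
        if index == 0 and has_header:
            continue  # skip the header row
        if len(row) < 2:
            continue  # skip blank or malformed rows
        rows.append(Utterance(row[0].strip(), row[1].strip()))
    return rows


# Usage with an in-memory sample; the file name is made up for illustration.
sample = "audio_file\ttext\nclip_0001.webm\tIndonesia liwu rumangkaeano.\n"
utterances = read_manifest(io.StringIO(sample))
```

For a file on disk, the same function can be given an open file object, for example `open(path, encoding="utf-8", newline="")`, where `path` points at the dataset's TSV file.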

Sample:

"Anaghaaini tandahano taghu 1950-an (seriwu siuamoghono limafulu) kadadiha newviteno moronene (Bombana) nofekundo sepaliha."

"Karangkaea kanandono mina kaawu okarangkaeano alam, Sampedua karangkaeano sabharan (budaya)."

"Indonesia liwu rumangkaeano."

"Kanando'ono Dhunia bhe Mebali'ino Kaelateha: Te Koneagho'ono Liwu ne Sulawesi Tenggara."

"Sulawesi Tenggara, nokoneagho liwu no Kadue bhe ne ko kalabiagho'ono liwu bhe no tumbu nekahobuto."

Writing System:

Latin alphabet (A–Z), Arabic numerals (0–9)
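When preparing transcripts for TTS training, it can help to flag characters that fall outside the stated writing system. The sketch below is one possible check, not part of the dataset tooling: the letter and digit sets follow the description above, while the punctuation set is an assumption based only on the sample sentences in this card, and the function name `unexpected_chars` is hypothetical.

```python
import string

# Letters and digits named in the writing system description.
ALLOWED = set(string.ascii_letters + string.digits)

# Assumption: whitespace and punctuation observed in the sample sentences.
PUNCTUATION = set(" .,:;'\"()-?!")


def unexpected_chars(text: str) -> list:
    """Return the sorted, de-duplicated characters not covered by the sets above."""
    return sorted({ch for ch in text if ch not in ALLOWED and ch not in PUNCTUATION})


# Usage: an empty list means every character is within the expected inventory.
flagged = unexpected_chars("Indonesia liwu rumangkaeano.")
```

Characters reported by this check (for example typographic apostrophes) are not necessarily errors; they only indicate text that may need normalization before training.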

Useful Links:

Magazine link: https://balaibahasasultra.kemendikdasmen.go.id/pabitara/

Children's storybook link: https://balaibahasasultra.kemendikdasmen.go.id/cerita-anak/

Muna language researcher link: https://www.bahasamuna.org/ind/home

Language Map of Kemendikbud: https://petabahasa.kemdikbud.go.id/