Chitwan 1.0
License:
CC0-1.0
Steward:
Open Home Foundation
Task: TTS
Release Date: 12/6/2025
Format: WEBM
Size: 61.68 MB
Description
Text to speech dataset for Nepali, male speaker, approximately 1 hour of read speech.
Specifics
Considerations
Forbidden Usage
You agree not to attempt to determine the identity of speakers in the dataset.
Processes
Intended Use
Training and fine-tuning text-to-speech models
Metadata
Nepali (ne)
This dataset contains approximately 1 hour of scripted speech for Nepali (ne) from a single speaker.
Language
Nepali is an Indo-Aryan language native to the Himalayan region of South Asia. It is the official and most widely spoken language of Nepal.
Variants
There are no variants defined for this dataset.
Demographic information
The age and gender of the speaker were not reported. Dataset names may be gendered, but were assigned according to the speaker's preference only.
Text corpus
The text corpus comes from Piper Recording Studio, whose prompts originally come from Common Voice.
Statistics for the text corpus:
Average/median characters per sentence: 29/28
Average/median words per sentence: 5.8/6
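The figures above can be reproduced along the following lines. This is a minimal sketch, not the method used to produce the published numbers: the function name corpus_stats is hypothetical, words are assumed to be whitespace-separated tokens, and characters are assumed to be Unicode code points (so a Devanagari consonant plus its vowel sign counts as two). The dataset card does not say how characters or words were counted, so results may differ slightly from the values listed.

```python
from statistics import mean, median


def corpus_stats(sentences):
    """Return mean/median character and word counts per sentence.

    Assumptions (not stated in the dataset card):
    - a "character" is one Unicode code point, as counted by len();
    - a "word" is a whitespace-separated token, as produced by str.split().
    Raises statistics.StatisticsError if `sentences` is empty.
    """
    char_counts = [len(s) for s in sentences]
    word_counts = [len(s.split()) for s in sentences]
    return {
        "chars_mean": mean(char_counts),
        "chars_median": median(char_counts),
        "words_mean": mean(word_counts),
        "words_median": median(word_counts),
    }


if __name__ == "__main__":
    # Two of the sample sentences from this dataset, used only as illustration.
    sample = [
        "यी खानाहरूको बीचमा चिउरा तथा चियाको सेवन पनि गरिन्छ।",
        "रुघा चाडै निको होस् हजुरलाई ।",
    ]
    for name, value in corpus_stats(sample).items():
        print(f"{name}: {value}")
```

Counting grapheme clusters instead of code points would give lower character counts for Devanagari text; which convention applies here is not documented.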
Writing system
Nepali uses the Devanagari script.
Sample
5 randomly selected sentences:
यति सहमति भएपछि म पनि पोखराबाट आउनेहरूलाई पुलमै पर्खेर बसेँ ।
यी खानाहरूको बीचमा चिउरा तथा चियाको सेवन पनि गरिन्छ।
२९
रुघा चाडै निको होस् हजुरलाई ।
१६
Processing and validation
Audio was recorded online using Piper Recording Studio. No post-processing or validation was done to the text or audio.
Trained models
A pre-trained Piper voice model is available for download.
Contribute
If you would like to contribute your voice and have us train a Piper text-to-speech model, please contact us at voice@openhomefoundation.org
Acknowledgements
We would like to thank all contributors, as well as supporters of the Open Home Foundation.
License
This dataset is released under the Creative Commons Zero (CC0 1.0) license. By downloading this data, you agree not to attempt to determine the identity of speakers in the dataset.