Flemishguy 1.0
License: CC0-1.0
Steward: Open Home Foundation
Task: TTS
Release Date: 12/6/2025
Format: FLAC
Size: 73.69 MB
Description
Text to speech dataset for Dutch, male speaker, approximately 1 hour of read speech.
Specifics
Considerations
Forbidden Usage
You agree not to attempt to determine the identity of speakers in the dataset.
Processes
Ethical Review
Training and fine-tuning text-to-speech models
Metadata
Dutch (nl)
This dataset contains approximately 1 hour of scripted speech for Dutch (nl) from a single speaker.
Language
Dutch is a West Germanic language of the Indo-European language family. In Europe, Dutch is the native language of most of the population of the Netherlands and Flanders.
Variants
There are no variants defined for this dataset.
Demographic information
The age and gender of the speaker were not reported. Dataset names may be gendered, but were assigned according to the speaker's preference only.
Text corpus
The text corpus comes from Piper Recording Studio, which extends Microsoft's sample TTS scripts for Azure.
Microsoft provides the following recommendations:
To use these example scripts for training, it's recommended that you should do the sanity check to make sure it matches what the voice talent actually speaks in the audio and normalize the text before uploading the data. For example, change '50%' to fifty percent and '$45' to forty-five dollars. Normalization should apply to the scripts that contain digits, symbols, abbreviations, date, and time.
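As a rough illustration of the recommendation above, the following Python sketch flags sentences that probably still need normalization before training. The character set in SUSPECT and the function names are assumptions made for this example; they are not part of Piper Recording Studio or Microsoft's tooling. The check only catches digits and a few symbols, so abbreviations, dates, and times written in letters would still need a manual review.

```python
import re

# Assumed set of characters that usually signal un-normalized text
# (digits, percent and currency signs, a few other symbols).
SUSPECT = re.compile(r"[0-9%$€£&@#+=/]")


def needs_normalization(sentence):
    """Return True if the sentence contains a digit or a suspect symbol."""
    return SUSPECT.search(sentence) is not None


def flag_sentences(sentences):
    """Return (index, sentence) pairs for sentences that need a closer look."""
    return [(i, s) for i, s in enumerate(sentences) if needs_normalization(s)]
```

Flagged sentences can then be rewritten by hand so that the script matches what the speaker actually said.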
Statistics for the text corpus:
Average/median characters per sentence: 37.9/35
Average/median words per sentence: 7/7
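Figures like these could be reproduced with a short script such as the sketch below. It assumes that characters are counted with spaces and punctuation included and that words are split on whitespace; the dataset does not state which counting rules were used, so results may differ slightly from the numbers reported here. The function name corpus_stats is hypothetical.

```python
from statistics import mean, median


def corpus_stats(sentences):
    """Return (mean, median) pairs for characters and words per sentence.

    Assumption: characters include spaces and punctuation, and words are
    whitespace-separated tokens.
    """
    char_counts = [len(s) for s in sentences]
    word_counts = [len(s.split()) for s in sentences]
    return {
        "chars": (mean(char_counts), median(char_counts)),
        "words": (mean(word_counts), median(word_counts)),
    }
```

Passing the full list of prompt sentences to corpus_stats returns both pairs in one dictionary.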
Writing system
Dutch uses an extended Latin alphabet.
Symbol table
Standard alphabet:
Lowercase: a b c d e f g h i j k l m n o p q r s t u v w x y z è é ê ë ï
Uppercase: A B C D E F G H I J K L M N O P R S T U V W Z
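A symbol table of this kind can be derived from the text corpus by collecting every distinct letter. The sketch below is one possible way to do that, not necessarily how the table above was produced; the function name symbol_table is hypothetical. Sorting by code point places accented letters after z, which matches the ordering shown above.

```python
def symbol_table(sentences):
    """Return sorted lists of the distinct lowercase and uppercase letters."""
    letters = {ch for s in sentences for ch in s if ch.isalpha()}
    lower = sorted(ch for ch in letters if ch.islower())
    upper = sorted(ch for ch in letters if ch.isupper())
    return lower, upper
```

Letters that never occur in the corpus are absent from the result, which is a likely reason a table built this way can list fewer than 26 uppercase letters.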
Sample
Five randomly selected sentences:
Neem een ervaren gids mee op uw reis.
Ook het gebruik van een viool deed de band opvallen.
Sufheid kan optreden.
Je geeft ze bagage mee voor hun hele leven.
Je zou zeggen eerlijk is eerlijk.
Processing and validation
Audio was recorded online using Piper Recording Studio. No post-processing or validation was done to the text or audio.
Contribute
If you would like to contribute your voice and have us train a Piper text-to-speech model, please contact us at voice@openhomefoundation.org
Acknowledgements
We would like to thank all contributors, as well as supporters of the Open Home Foundation.
License
This dataset is released under the Creative Commons Zero (CC0 1.0) license. By downloading this data you agree not to attempt to determine the identity of speakers in the dataset.
