Thorsten-Voice-44kHz-Full

License: CC0-1.0

Steward: Community

Task: TTS

Release Date: 2/27/2026

Format: WAV, PARQUET

Size: 7.99 GB


Description

TV-44kHz-Full is a high-quality German speech dataset containing approximately 40 hours of transcribed recordings (38,000+ files) by Thorsten Müller, a single native male speaker. It combines multiple Thorsten-Voice subsets (neutral, emotional, and Hessian dialect) at the original 44.1 kHz sampling rate. The dataset is released under CC0 to enable unrestricted research and commercial use.

Specifics

Licensing

Creative Commons Zero v1.0 Universal (CC0-1.0)

https://spdx.org/licenses/CC0-1.0.html

Considerations

Restrictions/Special Constraints

None. Released under CC0 (public domain dedication).

Forbidden Usage

None from the licensor’s side. Users are responsible for complying with applicable laws and ethical standards.

Processes

Ethical Review

The dataset consists exclusively of voluntary recordings of the contributor’s own voice. No third-party voices or personal data are included. All recordings were created with the explicit intention of unrestricted public release under CC0. No formal institutional ethical review was required, as the dataset contains only self-recorded material. The dataset was released in the spirit of openness, equality, and free access to knowledge. The contributor encourages responsible and socially beneficial use.

Intended Use

Intended for high-quality text-to-speech (TTS), expressive and dialectal speech synthesis, speech research, benchmarking, and commercial speech technology development.

Metadata

Overview

This dataset contains approximately 40 hours of German speech recordings (38,000+ WAV files) by a single native male speaker.

  • Mono

  • 44,100 Hz sample rate (original recording rate)

  • Denoised

  • Normalized to -24 dB

  • Trimmed silence at beginning and end
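The format properties listed above can be spot-checked on an individual file. The following is a minimal sketch using only the Python standard library. It assumes the files are PCM-encoded WAV, which the card does not state; the function names and the in-memory test helper are illustrative and not part of the dataset tooling.

```python
import io
import wave

EXPECTED_RATE = 44_100   # Hz, as stated in the overview
EXPECTED_CHANNELS = 1    # mono, as stated in the overview


def describe_wav(source):
    """Return basic format facts for a PCM WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }


def matches_card(info):
    """True if channel count and sample rate agree with the overview."""
    return (
        info["channels"] == EXPECTED_CHANNELS
        and info["sample_rate"] == EXPECTED_RATE
    )


def silent_wav_bytes(seconds, rate=EXPECTED_RATE):
    """Hypothetical helper: build a silent 16-bit mono WAV in memory for testing."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples (an assumption, not from the card)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```

Calling describe_wav on a path to one of the recordings and passing the result to matches_card gives a quick sanity check. Loudness normalization and denoising are not verified by this sketch.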

Structure

Provided in Hugging Face Datasets format (Parquet-based structure).

Attributes include:

  • audio

  • id

  • subset

  • style

  • text

  • samplerate

  • durationSeconds

  • charsPerSecond

  • recording_year-month

  • microphone

  • speaker

  • language

  • comment

Optimized for use with the Hugging Face datasets library.
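As an illustration, loading the data with the datasets library might look like the sketch below. The repository identifier and the split name "train" are assumptions, since this card does not state them. Streaming is used so that the roughly 8 GB of audio is not downloaded up front. The helper names are hypothetical.

```python
# Attribute names copied from the list above.
EXPECTED_COLUMNS = {
    "audio", "id", "subset", "style", "text", "samplerate",
    "durationSeconds", "charsPerSecond", "recording_year-month",
    "microphone", "speaker", "language", "comment",
}


def load_tv(repo_id, split="train"):
    """Stream the dataset from the Hugging Face Hub.

    repo_id is a placeholder to be filled in by the user; the split
    name "train" is an assumption.
    """
    from datasets import load_dataset  # requires: pip install datasets

    return load_dataset(repo_id, split=split, streaming=True)


def missing_columns(record):
    """Return the documented attributes that a record does not carry."""
    return sorted(EXPECTED_COLUMNS - set(record))


# Example usage (not executed here; needs network access and a real repo id):
#   ds = load_tv("<owner>/<dataset-name>")
#   first = next(iter(ds))
#   print(missing_columns(first))   # an empty list if the schema matches
```

Checking the first record against the documented attribute list is a cheap way to notice schema changes before starting a longer processing run.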

Included Subsets

  • TV-2021.02-Neutral (~22,000 recordings)

  • TV-2022.10-Neutral (~12,000 recordings)

  • TV-2021.06-Emotional (~2,000 recordings)

  • TV-2023.09-Hessisch (~2,000 recordings)

All subsets are provided at the original 44.1 kHz sampling rate.
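The approximate recording counts above can be compared against the data itself. The sketch below tallies recordings and total hours per subset from the subset and durationSeconds attributes. It assumes durationSeconds holds a number of seconds and that the subset values correspond to the names listed above; neither is confirmed by this card.

```python
from collections import defaultdict


def hours_by_subset(records):
    """Tally recording count and total hours for each value of 'subset'.

    records is any iterable of dict-like rows that carry the 'subset'
    and 'durationSeconds' attributes.
    """
    totals = defaultdict(lambda: {"recordings": 0, "hours": 0.0})
    for rec in records:
        entry = totals[rec["subset"]]
        entry["recordings"] += 1
        entry["hours"] += rec["durationSeconds"] / 3600.0
    return dict(totals)
```

Only two attributes are read, so when working with the datasets library it may be worth restricting the columns first (for example with select_columns, if the installed version provides it) to avoid decoding audio that is never used.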

Styles

  • Neutral

  • Hessisch (regional dialect of Hessen, Germany)

  • Emotional styles:

    • surprised

    • disgusted

    • drunk-style (recorded sober)

    • angry

    • amused

    • whisper

    • sleepy
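For expressive TTS work it is often useful to pull out only some of the styles listed above. A minimal sketch of such a filter follows. The exact label strings stored in the style attribute are an assumption (for example, whether the drunk style is stored as "drunk" or under another label), so the comparison is case-insensitive and the wanted labels are supplied by the caller.

```python
def select_styles(records, wanted):
    """Yield only the records whose 'style' matches one of the wanted labels.

    Matching is case-insensitive. Records without a 'style' value are skipped.
    """
    wanted_lower = {label.lower() for label in wanted}
    for rec in records:
        if str(rec.get("style", "")).lower() in wanted_lower:
            yield rec
```

Because the function is a generator, it also works on a streamed dataset without holding all rows in memory. Inspecting the distinct style values first is advisable before relying on specific labels.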

Licensing

Released under CC0 (public domain dedication). No restrictions apply.

Citation

Müller, Thorsten (2024). TV-44kHz-Full. Hugging Face. DOI: 10.57967/hf/3290

Project Context

Thorsten-Voice is an open initiative to provide high-quality German speech datasets and TTS models as free and unrestricted resources.

Contributor Statement

“I contribute my personal voice as a person believing in a world where all people are equal, regardless of gender, sexual orientation, religion, skin color, or geographic coordinates of birth. I believe in a global world where everyone is welcome everywhere and where free knowledge and education is accessible to all.”