Datasets

Search results for “podcast”
Community

Podcast Homostoria (Indonesia)

This dataset features discussions of modern media, including film, podcasts, and social media, and its connection to local customs and traditions. The conversations are primarily in Indonesian, with frequent code-switching into English and Javanese.

License: CC-BY-SA-4.0
Locale: id
Task: ASR
Format: mp3
Size: 302.97 MB

Community

Podcast Hari Minggoean (Indonesia)

This dataset is derived from the "Hari Minggoean" podcast and contains over ten hours of recorded speech from a single, consistent speaker, split across 42 individual audio files. The content, tailored for a young Indonesian audience, is in Indonesian (Bahasa Indonesia) with English code-switching and a discernible Javanese accent.

Sample: "Tapi dari pelafalan, dari intonasi, dari jedanya dia bicara. That's really good. Dan aku sebenarnya suka banget ketika dia ngomong. Yang btw, soal ekspresif tadi. Aku jadi kepikiran deh."

License: CC-BY-SA-4.0
Locale: id-ID
Task: ASR
Format: mp3
Size: 338.92 MB

Open Home Foundation

Kerstin 1.0

Text-to-speech dataset for German, female speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WEBM
Size: 132.05 MB

Open Home Foundation

Pim 1.0

Text-to-speech dataset for Dutch, male speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: nl-NL
Task: TTS
Format: WEBM
Size: 108.08 MB

Open Home Foundation

Tugão 1.0

Text-to-speech dataset for Portuguese, male speaker, approximately 1.5 hours of read speech.

License: CC0-1.0
Locale: pt-PT
Task: TTS
Format: WEBM
Size: 61.84 MB

Open Home Foundation

Faber 1.0

Text-to-speech dataset for Brazilian Portuguese, male speaker, approximately 1.5 hours of read speech.

License: CC0-1.0
Locale: pt-BR
Task: TTS
Format: WEBM
Size: 30.98 MB

Open Home Foundation

Nathalie 1.0

Text-to-speech dataset for Dutch, female speaker, approximately 1 hour of read speech.

License: CC0-1.0
Locale: nl-BE
Task: TTS
Format: WEBM
Size: 21.87 MB

Open Home Foundation

Jeff 1.0

Text-to-speech dataset for Brazilian Portuguese, male speaker, approximately 1.5 hours of read speech.

License: CC0-1.0
Locale: pt-BR
Task: TTS
Format: WEBM
Size: 90.74 MB

Open Home Foundation

Joe 1.0

Text-to-speech dataset for English, male speaker, approximately 1 hour of read speech.

License: CC0-1.0
Locale: en-US
Task: TTS
Format: WEBM
Size: 75.78 MB

Community

Thorsten-Voice Dataset 2021.02

German neutral speech dataset (22,668 phrases, 23+ hours), CC0-licensed, LJSpeech-compatible, for TTS research and development.

License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WAV, CSV
Size: 2.55 GB
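
"LJSpeech-compatible" usually means a pipe-delimited metadata CSV listed alongside a folder of WAV clips. The following is a minimal sketch of reading such a file, assuming rows of the form `id|text` or `id|text|normalized text` and clips stored as `wavs/<id>.wav`. The `Utterance` type, the function name, and the `wavs` folder default are illustrative assumptions, not taken from the dataset's documentation.

```python
import csv
from typing import Iterable, List, NamedTuple


class Utterance(NamedTuple):
    clip_id: str
    text: str
    wav_path: str


def parse_ljspeech_metadata(lines: Iterable[str], wav_dir: str = "wavs") -> List[Utterance]:
    """Parse LJSpeech-style rows: id|text or id|text|normalized text."""
    # QUOTE_NONE keeps quotation marks inside transcripts as literal text.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    utterances = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        clip_id = row[0].strip()
        # Prefer the normalized transcript (third column) when it is present.
        has_normalized = len(row) > 2 and row[2].strip()
        text = (row[2] if has_normalized else row[1]).strip()
        utterances.append(Utterance(clip_id, text, f"{wav_dir}/{clip_id}.wav"))
    return utterances


# Hypothetical rows for illustration only; real clip IDs will differ.
sample_rows = ["a001|Guten Tag.|Guten Tag.", "a002|Hallo Welt.", ""]
for utt in parse_ljspeech_metadata(sample_rows):
    print(utt.clip_id, utt.wav_path, utt.text)
```

With a real download, the same function can be fed an open file handle (opened with `encoding="utf-8"` and `newline=""`) instead of the in-memory list.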

Open Home Foundation

Gosia 1.0

Text-to-speech dataset for Polish, female speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: pl-PL
Task: TTS
Format: WEBM
Size: 39.75 MB

Open Home Foundation

Cadu 1.0

Text-to-speech dataset for Brazilian Portuguese, male speaker, approximately 1.5 hours of read speech.

License: CC0-1.0
Locale: pt-BR
Task: TTS
Format: WEBM
Size: 30.98 MB

Open Home Foundation

Kathleen 1.0

Text-to-speech dataset for English, female speaker, approximately 1 hour of read speech.

License: CC0-1.0
Locale: en-US
Task: TTS
Format: FLAC
Size: 211.96 MB

Open Home Foundation

Ronnie 1.0

Text-to-speech dataset for Dutch, male speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: nl-NL
Task: TTS
Format: WEBM
Size: 106.23 MB

Open Home Foundation

Flemishguy 1.0

Text-to-speech dataset for Dutch, male speaker, approximately 1 hour of read speech.

License: CC0-1.0
Locale: nl-BE
Task: TTS
Format: FLAC
Size: 73.69 MB

Open Home Foundation

Dave 1.0

Text-to-speech dataset for Spanish, male speaker, approximately 1.5 hours of read speech.

License: CC0-1.0
Locale: es-ES
Task: TTS
Format: WEBM
Size: 85.24 MB

Open Home Foundation

Darkman 1.0

Text-to-speech dataset for Polish, male speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: pl-PL
Task: TTS
Format: WEBM
Size: 40.42 MB

Community

Thorsten-Voice Dataset 2022.10

German neutral speech dataset (12,450 phrases, 11+ hours), CC0-licensed, LJSpeech-compatible, for TTS research and development.

License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WAV, CSV
Size: 1.30 GB

Open Home Foundation

Denis 1.0

Text-to-speech dataset for Russian, male speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: ru-RU
Task: TTS
Format: WEBM
Size: 104.52 MB

Open Home Foundation

Berta 1.0

Text-to-speech dataset for Hungarian, female speaker, approximately 1 hour of read speech.

License: CC0-1.0
Locale: hu-HU
Task: TTS
Format: FLAC
Size: 209.52 MB

Community

Thorsten-Voice Dataset 2023.09 Hessisch

German regional dialect speech dataset (Hessisch, 2,108 phrases), CC0-licensed, 22,050 Hz mono WAV, for TTS and speech research.

License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WAV, CSV
Size: 255.96 MB
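
The stated audio format (22,050 Hz, mono, WAV) can be checked on a downloaded clip before training. Below is a minimal sketch using Python's standard `wave` module. The helper names are illustrative assumptions, and the demo uses a generated silent clip rather than a file from the dataset.

```python
import io
import wave


def wav_matches(source, expected_rate: int = 22050, expected_channels: int = 1) -> bool:
    """Return True if the WAV header reports the expected sample rate and channel count."""
    # `source` may be a file path or a binary file-like object.
    with wave.open(source, "rb") as wav:
        return (
            wav.getframerate() == expected_rate
            and wav.getnchannels() == expected_channels
        )


def make_silent_wav(rate: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build an in-memory 16-bit PCM WAV of silence, used only for the demo."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds) * channels)
    buffer.seek(0)
    return buffer


print(wav_matches(make_silent_wav(22050, 1)))  # clip in the stated format
print(wav_matches(make_silent_wav(44100, 1)))  # clip at a different sample rate
```

For a real clip, pass its path to `wav_matches` in place of the in-memory buffer.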

Community

Thorsten-Voice-44kHz-Full

German speech dataset (44.1 kHz, 38k+ files, ~40 hours), CC0-licensed, multi-style (neutral, emotional, dialect), for TTS research.

License: CC0-1.0
Locale: de-DE
Task: TTS
Format: WAV, PARQUET
Size: 7.99 GB

Open Home Foundation

Mihai 1.0

Text-to-speech dataset for Romanian, male speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: ro-RO
Task: TTS
Format: WEBM
Size: 66.31 MB

Open Home Foundation

Lili 1.0

Text-to-speech dataset for Slovak, female speaker, approximately 2 hours of read speech.

License: CC0-1.0
Locale: sk-SK
Task: TTS
Format: WEBM
Size: 72.38 MB