Datasets

Search results for “common voice”
Common Voice

Common Voice v24 English - en-AU subset for Everything Open 2026

Common Voice v24 English filtered on the `accent` field for Australian-related accents.
License: CC0-1.0
Locale: en-AU
Task: ASR
Format: CSV, MP3
Size: 1.92 GB
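
The accent-based filtering described for this subset can be illustrated with a small sketch. It is not the procedure used to build the dataset: the column name `accent` comes from the description above, but the helper `filter_by_accent`, the keyword `"australia"`, and the sample rows are assumptions made up for illustration.

```python
import csv
import io


def filter_by_accent(rows, keywords, field="accent"):
    """Yield rows whose accent field contains any keyword (case-insensitive).

    `field` defaults to "accent" as named in the dataset description; the
    real metadata files may use a different column name.
    """
    lowered = [k.lower() for k in keywords]
    for row in rows:
        value = (row.get(field) or "").lower()
        if any(k in value for k in lowered):
            yield row


# Tiny in-memory sample standing in for a metadata file; all values invented.
SAMPLE = """path,sentence,accent
clip_1.mp3,Hello there,Australian English
clip_2.mp3,Good morning,United States English
clip_3.mp3,How are you,"England English,Australian English"
clip_4.mp3,No accent given,
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
au_rows = list(filter_by_accent(reader, ["australia"]))
print([r["path"] for r in au_rows])  # ['clip_1.mp3', 'clip_3.mp3']
```

A substring match is used so that rows listing several accents in one field are still picked up; a stricter exact match would be an equally reasonable choice.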

Common Voice

Mozilla Common Voice Spontaneous Speech ASR Shared Task Train/Dev Data

This datasheet is for the bundle of Mozilla Common Voice spontaneous speech datasets to be used in the Shared Task on Spontaneous Speech.
License: CC0-1.0
Locale: mul
Task: ASR
Format: MP3
Size: 4.30 GB

Common Voice

Mozilla Common Voice Spontaneous Speech ASR Shared Task Test Data

A bundle of the held-out test data for the Mozilla Common Voice Spontaneous Speech ASR shared task.
License: CC0-1.0
Locale: mul
Task: ASR
Format: MP3, TSV
Size: 784.80 MB

Common Voice

Mozilla Common Voice Text Language Identification dataset

A dataset for text-based language identification, comprising 19 million sentences from over 300 languages, taken from the Mozilla Common Voice scripted (v23) and spontaneous (v1) speech projects.
License: CC0-1.0
Locale: mul
Task: NLP
Format: TSV
Size: 950.41 MB

Common Voice

Common Voice 7.0 - Single Word Target Segment

This dataset contains the numbers 0 to 9 and the words "yes" and "no" in 34 languages. It contains 84 validated hours of speech.
License: CC0-1.0
Locale: mul
Task: ASR
Format: TSV, MP3
Size: 3.51 GB

Common Voice

Common Voice Spontaneous Speech 2.0 - Wixárika

A collection of spontaneous spoken phrases in Wixárika.
License: CC0-1.0
Locale: hch
Task: ASR
Format: MP3
Size: 198.80 MB

Common Voice

Common Voice Scripted Speech 24.0 - Mbum

A collection of scripted spoken phrases in Mbum.
License: CC0-1.0
Locale: mdd
Task: ASR
Format: MP3
Size: 199.04 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Gorani

A collection of spontaneous spoken phrases in Gorani.
License: CC0-1.0
Locale: hac
Task: ASR
Format: MP3
Size: 224.46 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Puno Quechua

A collection of spontaneous spoken phrases in Puno Quechua.
License: CC0-1.0
Locale: qxp
Task: ASR
Format: MP3
Size: 178.68 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Kabardian

A collection of spontaneous spoken phrases in Kabardian.
License: CC0-1.0
Locale: kbd
Task: ASR
Format: MP3
Size: 162.64 MB

Common Voice

Common Voice Scripted Speech 24.0 - Asturian

A collection of scripted spoken phrases in Asturian.
License: CC0-1.0
Locale: ast
Task: ASR
Format: MP3
Size: 40.89 MB

Common Voice

Common Voice Scripted Speech 24.0 - Moksha

A collection of scripted spoken phrases in Moksha.
License: CC0-1.0
Locale: mdf
Task: ASR
Format: MP3
Size: 10.54 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bodo

A collection of spontaneous spoken phrases in Bodo.
License: CC0-1.0
Locale: brx
Task: ASR
Format: MP3
Size: 1.29 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Nubi

A collection of spontaneous spoken phrases in Nubi.
License: CC0-1.0
Locale: kcn
Task: ASR
Format: MP3
Size: 283.63 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Galician

A collection of spontaneous spoken phrases in Galician.
License: CC0-1.0
Locale: gl
Task: ASR
Format: MP3
Size: 23.40 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Kenyi

A collection of spontaneous spoken phrases in Kenyi.
License: CC0-1.0
Locale: lke
Task: ASR
Format: MP3
Size: 251.32 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Sabah Bisaya

A collection of spontaneous spoken phrases in Sabah Bisaya.
License: CC0-1.0
Locale: bsy
Task: ASR
Format: MP3
Size: 219.99 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Georgian

A collection of spontaneous spoken phrases in Georgian.
License: CC0-1.0
Locale: ka
Task: ASR
Format: MP3
Size: 11.57 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Thur

A collection of spontaneous spoken phrases in Thur.
License: CC0-1.0
Locale: lth
Task: ASR
Format: MP3
Size: 292.98 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bukusu

A collection of spontaneous spoken phrases in Bukusu.
License: CC0-1.0
Locale: bxk
Task: ASR
Format: MP3
Size: 258.53 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Catalan

A collection of spontaneous spoken phrases in Catalan.
License: CC0-1.0
Locale: ca
Task: ASR
Format: MP3
Size: 11.78 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Eastern Min

A collection of spontaneous spoken phrases in Eastern Min.
License: CC0-1.0
Locale: cdo
Task: ASR
Format: MP3
Size: 190.61 MB

Common Voice

Common Voice Scripted Speech 24.0 - Latvian

A collection of scripted spoken phrases in Latvian.
License: CC0-1.0
Locale: lv
Task: ASR
Format: MP3
Size: 5.79 GB

Common Voice

Common Voice Spontaneous Speech 2.0 - Ligurian

A collection of spontaneous spoken phrases in Ligurian.
License: CC0-1.0
Locale: lij
Task: ASR
Format: MP3
Size: 48.21 MB