Datasets

Common Voice

Common Voice v24 English - en-AU subset for Everything Open 2026

Common Voice v24 English filtered on the `accent` field for Australian-related accents.

License: CC0-1.0

Locale: en-AU

Task: ASR

Format: CSV, MP3

Size: 1.92 GB
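
A subset like the one described above could be approximated by filtering the metadata on its `accent` column. The following is a minimal sketch, not the method used to build this dataset: the keyword list, the helper names `is_australian` and `filter_rows`, and the sample rows are all hypothetical, and the real metadata file may use different column names or accent labels.

```python
import csv
import io

# Hypothetical keywords treated as "Australian-related"; the actual filter may differ.
AU_KEYWORDS = ("australia",)


def is_australian(accent: str) -> bool:
    """Return True if the accent label mentions any keyword (case-insensitive)."""
    label = accent.lower()
    return any(keyword in label for keyword in AU_KEYWORDS)


def filter_rows(rows, column="accent"):
    """Keep only the rows whose accent column looks Australian-related."""
    return [row for row in rows if is_australian(row.get(column) or "")]


# Tiny in-memory sample standing in for the metadata CSV (assumed layout).
SAMPLE = (
    "path,sentence,accent\n"
    "a.mp3,Hello there,Australian English\n"
    "b.mp3,Good morning,United States English\n"
    "c.mp3,How are you,\n"
    'd.mp3,See you soon,"Australian English,New Zealand English"\n'
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
subset = filter_rows(rows)
print([row["path"] for row in subset])
```

Substring matching is used here because an accent field can hold more than one label; an exact-match filter would silently drop such rows.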

Common Voice

Common Voice Spontaneous Speech 2.0 - Irish

A collection of spontaneous spoken phrases in Irish.

License: CC0-1.0

Locale: ga-IE

Task: ASR

Format: MP3

Size: 3.13 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Galician

A collection of spontaneous spoken phrases in Galician.

License: CC0-1.0

Locale: gl

Task: ASR

Format: MP3

Size: 23.40 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Alsatian

A collection of spontaneous spoken phrases in Alsatian.

License: CC0-1.0

Locale: gsw

Task: ASR

Format: MP3

Size: 85.53 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Manx

A collection of spontaneous spoken phrases in Manx.

License: CC0-1.0

Locale: gv

Task: ASR

Format: MP3

Size: 15.35 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Gorani

A collection of spontaneous spoken phrases in Gorani.

License: CC0-1.0

Locale: hac

Task: ASR

Format: MP3

Size: 224.46 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Wixárika

A collection of spontaneous spoken phrases in Wixárika.

License: CC0-1.0

Locale: hch

Task: ASR

Format: MP3

Size: 198.80 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Georgian

A collection of spontaneous spoken phrases in Georgian.

License: CC0-1.0

Locale: ka

Task: ASR

Format: MP3

Size: 11.57 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Kabardian

A collection of spontaneous spoken phrases in Kabardian.

License: CC0-1.0

Locale: kbd

Task: ASR

Format: MP3

Size: 162.64 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Nubi

A collection of spontaneous spoken phrases in Nubi.

License: CC0-1.0

Locale: kcn

Task: ASR

Format: MP3

Size: 283.63 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Ligurian

A collection of spontaneous spoken phrases in Ligurian.

License: CC0-1.0

Locale: lij

Task: ASR

Format: MP3

Size: 48.21 MB

Common Voice

Common Voice Scripted Speech 24.0 - Latvian

A collection of scripted spoken phrases in Latvian.

License: CC0-1.0

Locale: lv

Task: ASR

Format: MP3

Size: 5.79 GB

Common Voice

Common Voice Spontaneous Speech 2.0 - Kenyi

A collection of spontaneous spoken phrases in Kenyi.

License: CC0-1.0

Locale: lke

Task: ASR

Format: MP3

Size: 251.32 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Thur

A collection of spontaneous spoken phrases in Thur.

License: CC0-1.0

Locale: lth

Task: ASR

Format: MP3

Size: 292.98 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Latvian

A collection of spontaneous spoken phrases in Latvian.

License: CC0-1.0

Locale: lv

Task: ASR

Format: MP3

Size: 5.10 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Mixteco Yucuhiti

A collection of spontaneous spoken phrases in Mixteco Yucuhiti.

License: CC0-1.0

Locale: meh

Task: ASR

Format: MP3

Size: 201.66 MB

Common Voice

Mozilla Common Voice Text Language Identification dataset

A dataset for text-based language identification, comprising 19 million sentences in over 300 languages drawn from the Mozilla Common Voice scripted (v23) and spontaneous (v1) speech projects.

License: CC0-1.0

Locale: mul

Task: NLP

Format: TSV

Size: 950.41 MB

Common Voice

Common Voice Scripted Speech 24.0 - Mbo

A collection of scripted spoken phrases in Mbo.

License: CC0-1.0

Locale: mbo

Task: ASR

Format: MP3

Size: 242.40 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Melanau

A collection of spontaneous spoken phrases in Melanau.

License: CC0-1.0

Locale: mel

Task: ASR

Format: MP3

Size: 208.47 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Michoacán Mazahua

A collection of spontaneous spoken phrases in Michoacán Mazahua.

License: CC0-1.0

Locale: mmc

Task: ASR

Format: MP3

Size: 225.51 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Bahasa Malay

A collection of spontaneous spoken phrases in Bahasa Malay.

License: CC0-1.0

Locale: ms-MY

Task: ASR

Format: MP3

Size: 125.94 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Sabah Malay

A collection of spontaneous spoken phrases in Sabah Malay.

License: CC0-1.0

Locale: msi

Task: ASR

Format: MP3

Size: 275.80 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Western Penan

A collection of spontaneous spoken phrases in Western Penan.

License: CC0-1.0

Locale: pne

Task: ASR

Format: MP3

Size: 247.12 MB

Common Voice

Common Voice Spontaneous Speech 2.0 - Puno Quechua

A collection of spontaneous spoken phrases in Puno Quechua.

License: CC0-1.0

Locale: qxp

Task: ASR

Format: MP3

Size: 178.68 MB