Common Voice 7.0 - Single Word Target Segment

License: CC0-1.0

Steward: Common Voice

Task: ASR

Release Date: 1/29/2026

Format: TSV, MP3

Size: 3.51 GB

Description

This dataset contains recordings of the digits 0 to 9 and the words "yes" and "no" in 34 languages. It comprises 142 hours of speech in total, of which 84 hours are validated. Its intended use case is spoken digit recognition and yes/no detection. It was collected as part of the Common Voice 7.0 release.

Specifics

Licensing

Creative Commons Zero v1.0 Universal (CC0-1.0)

https://spdx.org/licenses/CC0-1.0.html

Considerations

Restrictions/Special Constraints

n/a

Forbidden Usage

It is forbidden to attempt to determine the identity of speakers in the Common Voice datasets. It is forbidden to re-host or re-share this dataset.

Processes

Intended Use

This dataset is intended to be used for training and evaluating automatic speech recognition (ASR) models. It may also be used for applications relating to computer-aided language learning (CALL) and language or heritage revitalisation.
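As a rough illustration of preparing the TSV metadata for a digit or yes/no recognition task, the sketch below reads tab-separated rows and pairs each clip with a normalized target word. The column names (`client_id`, `path`, `sentence`), the clip file names, and the sample rows are assumptions made for the example and are not specified by this card; check the headers of the actual TSV files before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a real TSV file. The column
# names and clip names are assumptions for illustration only.
SAMPLE_TSV = (
    "client_id\tpath\tsentence\n"
    "a1\tclip_0001.mp3\tyes\n"
    "a2\tclip_0002.mp3\tno\n"
    "a3\tclip_0003.mp3\tseven\n"
    "a1\tclip_0004.mp3\tYes\n"
)


def load_clips(tsv_file):
    """Return (clip path, lowercased target word) pairs from a TSV file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row["path"], row["sentence"].strip().lower()) for row in reader]


def label_counts(clips):
    """Count how many clips exist for each target word."""
    return Counter(label for _, label in clips)


clips = load_clips(io.StringIO(SAMPLE_TSV))
counts = label_counts(clips)
print(counts)
```

With a real file, `io.StringIO(SAMPLE_TSV)` would be replaced by something like `open(tsv_path, newline="", encoding="utf-8")`. The per-word counts give a quick check of class balance before training.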

Metadata

Languages

Abkhaz (ab)
Arabic (ar)
Breton (br)
Catalan (ca)
Czech (cs)
Chuvash (cv)
Welsh (cy)
German (de)
English (en)
Esperanto (eo)
Spanish (es)
Basque (eu)
French (fr)
Frisian (fy-NL)
Indonesian (id)
Japanese (ja)
Georgian (ka)
Kabyle (kab)
Kyrgyz (ky)
Luganda (lg)
Dutch (nl)
Odia (or)
Polish (pl)
Portuguese (pt)
Russian (ru)
Kinyarwanda (rw)
Swedish (sv-SE)
Tamil (ta)
Thai (th)
Turkish (tr)
Tatar (tt)
Chinese [China] (zh-CN)
Chinese [Hong Kong] (zh-HK)
Chinese [Taiwan] (zh-TW)

References

Help create Common Voice’s first target segment