Kannada Time-Aligned Speech Corpus
License:
CC-BY-NC-SA-4.0
Steward:
MirasAI
Task: ASR
Release Date: 4/1/2026
Format: OGG, SRT
Size: 355.77 MB
Description
The Kannada Time-Aligned Speech Corpus is a 5-hour speech dataset containing Kannada audio with corresponding time-aligned transcriptions. It is designed to support speech technology and research tasks such as automatic speech recognition, forced alignment, speech segmentation, pronunciation modeling, and spoken language analysis. The dataset provides a useful resource for developing and evaluating Kannada language technologies.
Specifics
Licensing
Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)
https://spdx.org/licenses/CC-BY-NC-SA-4.0.html
Considerations
Restrictions/Special Constraints
Use is permitted with attribution for non-commercial purposes only, and any shared adaptations must be distributed under the same license terms.
Forbidden Usage
Forbidden uses include commercial use, redistribution without proper attribution, and sharing modified versions under a different license.
Processes
Intended Use
This dataset is intended for use in speech technology and language research, including automatic speech recognition, forced alignment, speech-text matching, and spoken Kannada language processing.
Metadata
Language
Kannada is a major Dravidian language primarily spoken in the Indian state of Karnataka and by Kannada-speaking communities in other parts of India and abroad. It has a long literary history, a rich written tradition, and its own script. Kannada is widely used in education, media, administration, literature, and everyday communication, making it one of the most important languages of South India.
Data Structure
The dataset is organized into two main folders:
Audio/ — contains the Kannada speech recordings
Transcription/ — contains the corresponding text transcriptions for each audio file
Each transcription file corresponds to an audio file, making the dataset easy to use for speech processing, alignment, and transcription-based tasks.
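The folder layout above suggests a simple pairing step. The following is a minimal sketch, assuming that an audio file and its transcription share the same filename stem and use the `.ogg` and `.srt` extensions. The naming convention is not documented here, so treat both as assumptions. The helper names `pair_by_stem` and `scan_corpus` are hypothetical.

```python
from pathlib import Path


def pair_by_stem(audio_paths, transcript_paths):
    """Pair audio and transcript paths that share a filename stem.

    Assumption: matching files have identical stems, e.g. x.ogg and x.srt.
    """
    audio = {Path(p).stem: Path(p) for p in audio_paths}
    text = {Path(p).stem: Path(p) for p in transcript_paths}

    # Stems present on both sides form usable pairs.
    pairs = [(audio[s], text[s]) for s in sorted(audio.keys() & text.keys())]
    # Stems present on only one side are reported for manual review.
    missing_transcript = sorted(audio.keys() - text.keys())
    missing_audio = sorted(text.keys() - audio.keys())
    return pairs, missing_transcript, missing_audio


def scan_corpus(root):
    """Apply pair_by_stem to the Audio/ and Transcription/ folders under root."""
    root = Path(root)
    return pair_by_stem(
        (root / "Audio").glob("*.ogg"),
        (root / "Transcription").glob("*.srt"),
    )
```

Calling `scan_corpus` on the unpacked dataset directory returns the matched pairs plus two lists of unmatched stems, which can be inspected before any training or alignment run.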
Speaker Information
The dataset includes recordings from two native Kannada speakers:
Speaker 1: Male, 32 years old
Speaker 2: Female, 39 years old
With one male and one female speaker of different ages, the corpus offers only basic speaker diversity in gender and age.
Recommended Processing
Verify audio quality
Normalize transcription text
Match audio and transcription filenames
Check alignment consistency
Remove noisy or corrupted files
Standardize formats and metadata
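Two of the steps above, text normalization and alignment checking, can be sketched as follows. This is an illustrative example and not the steward's procedure: it assumes that Unicode NFC normalization plus whitespace collapsing is an acceptable baseline for Kannada text, and that segments are supplied as `(start_ms, end_ms)` tuples in file order. The function names are hypothetical.

```python
import re
import unicodedata


def normalize_text(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace.

    Zero-width joiners are deliberately left untouched, since they can be
    meaningful in Indic scripts.
    """
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def find_alignment_issues(segments):
    """Return (index, reason) for each suspicious segment.

    segments: list of (start_ms, end_ms) tuples in file order.
    """
    issues = []
    prev_end = None
    for i, (start, end) in enumerate(segments):
        if end <= start:
            issues.append((i, "non-positive duration"))
        if prev_end is not None and start < prev_end:
            issues.append((i, "overlaps previous segment"))
        prev_end = end
    return issues
```

An empty result from `find_alignment_issues` means only that timestamps are ordered and non-overlapping. It does not confirm that the text matches the audio, which still needs listening checks or a forced aligner.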
Sample
1
00:00:00,001 --> 00:00:02,956
ನಾನು ಇಂದು ಶಿಕ್ಷಣದ ಬಗ್ಗೆ ಮಾತನಾಡಲ್ಲ ಶಿಕ್ಷಣದ
2
00:00:02,980 --> 00:00:04,783
ಮಹತ್ವದ ಬಗ್ಗೆ
3
00:00:04,807 --> 00:00:06,031
ಮಾತನಾಡಲು ಹೊರಟಿದ್ದೇನೆ.
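A sample like the one above can be read into timed segments with a small parser. The sketch below uses only the Python standard library and assumes the files follow the common SRT layout of a cue index, a timing line, and one or more text lines. It tolerates cues with or without blank separator lines. The names `to_ms` and `parse_srt` are hypothetical.

```python
import re

STAMP = r"\d+:\d{2}:\d{2}[,.]\d{3}"
TIMING = re.compile(rf"({STAMP})\s*-->\s*({STAMP}).*")


def to_ms(stamp):
    """Convert an SRT timestamp such as 00:00:02,956 to milliseconds."""
    h, m, s, ms = map(int, re.split(r"[:,.]", stamp.strip()))
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(content):
    """Return a list of (start_ms, end_ms, text) tuples in file order."""
    lines = [ln.strip() for ln in content.lstrip("\ufeff").splitlines()]
    segments = []
    for i, line in enumerate(lines):
        match = TIMING.fullmatch(line)
        if match:
            # A timing line opens a new cue.
            segments.append([to_ms(match.group(1)), to_ms(match.group(2)), []])
        elif segments and line:
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            # A bare number directly before a timing line is the next cue index.
            if line.isdigit() and TIMING.fullmatch(nxt):
                continue
            segments[-1][2].append(line)
    return [(start, end, " ".join(text)) for start, end, text in segments]
```

For the three cues shown in the sample, the first segment would start at 1 ms and end at 2956 ms. The resulting tuples can be used to cut the matching OGG file into utterance-level clips.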