Datasets
Ewondo_Fong_ALCAM-MultimodalDataset
License: NOODL-1.0
Locale: ewo
Task: NLP
Format: MP3, TSV
Size: 16.80 MB
Gawri (گاؤری) Magazine Corpus
License: CC-BY-NC-4.0
Locale: gwc
Task: NLP
Format: TXT
Size: 146.71 KB
Adamawa Fulfulde-French Parallel Corpus of Narratives 1.2
License: NOODL-1.0
Locale: fub
Task: MT
Format: TSV
Size: 112.17 KB
Compar:IA conversations
License: Etalab 2.0
Locale: fr
Task: NLG
Format: PARQUET
Size: 1.81 GB
Jember Javanese Spontaneous Speech Corpus
License: CC-BY-NC-SA-4.0
Locale: jav
Task: ASR
Format: MP3, TSV
Size: 271.65 MB
Western Balochi Literature Corpus
License: CC-BY-NC-4.0
Locale: bgn
Task: NLP
Format: TXT
Size: 2.26 MB
smoltalk-chinese
License: Apache-2.0
Locale: zh
Task: LLM
Format: PARQUET
Size: 879.81 MB
Brahui Research Work Corpus
License: CC-BY-NC-SA-4.0
Locale: brh
Task: NLP
Format: TXT
Size: 1.13 MB
Kohistani Shina Word List
License: CC-BY-NC-4.0
Locale: plk
Task: NLP
Format: TXT
Size: 394.05 KB
Khowar Word List
License: CC-BY-NC-4.0
Locale: khw
Task: NLP
Format: TXT
Size: 64.22 KB
Khowar Literature Corpus by FLI
License: CC-BY-NC-4.0
Locale: khw
Task: NLP
Format: TXT
Size: 244.85 KB
Gojri Literature Corpus
License: CC-BY-NC-4.0
Locale: gju
Task: NLP
Format: TXT
Size: 117.97 KB
Chishti Sons Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pnb
Task: NLP
Format: TXT
Size: 1.65 MB
Baloch Publishers Saraiki Literature Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: NLP
Format: TXT
Size: 2.04 MB
Kaleem Magazine Urdu Corpus
License: CC-BY-NC-4.0
Locale: urd
Task: NLP
Format: TXT
Size: 2.74 MB
Kaleem Art Press Urdu Literature Corpus
License: CC-BY-NC-4.0
Locale: ur
Task: OTH
Format: TXT
Size: 2.85 MB
Rana Printers Urdu Literature Corpus
License: CC-BY-NC-4.0
Locale: ur
Task: OTH
Format: TXT
Size: 3.00 MB
Kaleem Art Press Saraiki Literature Corpus
License: CC-BY-NC-4.0
Locale: skr
Task: OTH
Format: TXT
Size: 1.84 MB
Anjuman-e-Katib Farsi/Persian Literature Corpus
License: CC-BY-NC-4.0
Locale: fas
Task: NLP
Format: TXT
Size: 2.82 MB
Mediamen Punjabi Literature Corpus
License: CC-BY-NC-4.0
Locale: pnb
Task: NLP
Format: TXT
Size: 1.82 MB
Multilingual Religious Parallel Corpus (Kaleem Art Press)
License: CC-BY-SA-4.0
Locale: mul
Task: MT
Format: CSV
Size: 2.27 MB
Keblagh-e-Azergi Hazargi literature corpus
License: CC-BY-NC-4.0
Locale: haz
Task: NLP
Format: TXT
Size: 193.28 KB
Luhya ASR data subset 70 hours
License: CC-BY-4.0
Locale: luy
Task: ASR
Format: WAV, XLSX
Size: 13.90 GB
Aim Foundation Dari Literature Corpus
License: CC-BY-NC-4.0
Locale: prs
Task: NLP
Format: TXT
Size: 1.74 MB