
Sentence Aligned Speech Corpus | Official Website of Linguistic Data Consortium for Indian Languages

Sentence Aligned Speech Corpus

| Sl. No. | Language | Duration (hh:mm:ss) | Speakers | Size (GB) |
|---|---|---|---|---|
| 1 | Assamese | 30:18:16 | 304 | 19.5 |
| 2 | Bengali | 69:10:03 | 450 | 43.3 |
| 3 | Hindi | 72:34:52 | 473 | 45.9 |
| 4 | Kannada | 107:48:50 | 600 | 69.4 |
| 5 | Konkani | 83:19:42 | 487 | 53.5 |
| 6 | Maithili | 41:54:30 | 300 | 26 |
| 7 | Malayalam | 123:29:55 | 451 | 79.6 |
| 8 | Marathi | 41:34:04 | 302 | 26.7 |
| 9 | Nepali | 43:04:23 | 346 | 27.7 |
| 10 | Tamil | 74:57:59 | 433 | 46.4 |
| 11 | Urdu | 50:09:56 | 434 | 32.3 |
| 12 | Indian English - Bengali Variant | 09:21:08 | 52 | 5.53 |
| 13 | Indian English - Kannada Variant | 11:17:40 | 53 | 7.27 |
| 14 | Dogri | 08:32:54 | 61 | 5.6 |
| 15 | Maithili (Tirhuta Script) | 41:54:30 | 300 | 26 |
| 16 | Manipuri (Bengali Script) | 116:34:24 | 589 | 75.9 |
| 17 | Manipuri (Meetei Mayek) | 116:34:24 | 589 | 75.9 |
| 18 | Punjabi | 52:24:51 | 449 | 34.8 |
| 19 | Telugu | 15:38:53 | 80 | 10.1 |