Central Institute of Indian Languages [CIIL]
MISSION STATEMENT: Annotated, quality language data (both text and speech) and tools in Indian languages to individuals, institutions and industry for research and development - created in-house, through outsourcing and acquisition.
Text Corpora

Status of Text Corpora (as of February 2012):

Sl. No.  Language    Current status of entire corpora (No. of words)
1.       Assamese     6751508
2.       Bengali      7208845
3.       Bodo         1211178
4.       Dogri         230082
5.       English      2383616
6.       Gujarati     4355969
7.       Hindi       31116970
8.       Kannada      7552162
9.       Kashmiri      990262
10.      Kodava        182741
11.      Konkani      2817518
12.      Maithili     2681832
13.      Malayalam    5218043
14.      Manipuri     3510123
15.      Marathi      2312239
16.      Nepali       6246425
17.      Oriya        1047204
18.      Punjabi      3953361
19.      Sanskrit      517642
20.      Tamil        9294292
21.      Urdu         5184915
22.      Yarava         13904
23.      Telugu       1911685

Sample Files of Text Corpora:

Sl. No.  Language    PDF files  Doc files
1.       Assamese    1 2 3      1 2 3
2.       Bengali     1 2 3      1 2 3
3.       Bodo        1 2 3      1 2 3
4.       Dogri       1 2 3      1 2 3
5.       Gujarati    1 2 3      1 2 3
6.       Hindi       1 2 3      1 2 3
7.       Kannada     1 2 3      1 2 3
8.       Kashmiri    1 2 3      1 2 3
9.       Konkani     1 2 3      1 2 3
10.      Maithili    1 2 3      1 2 3
11.      Malayalam   1 2 3      1 2 3
12.      Manipuri    1 2 3      1 2 3
13.      Marathi     1 2 3      1 2 3
14.      Nepali      1 2 3      1 2 3
15.      Oriya       1 2 3      1 2 3
16.      Punjabi     1 2 3      1 2 3
17.      Sanskrit    1 2 3      1 2 3
18.      Santhali    1 2 3      1 2 3
19.      Sindhi      1 2 3      1 2 3
20.      Tamil       1 2 3      1 2 3
21.      Telugu      1 2 3      1 2 3
22.      Urdu        1 2 3      1 2 3

Developed & Maintained by: LDC-IL, CIIL
Copyright © LDC-IL, Central Institute of Indian Languages
Central Institute of Indian Languages
Department of Higher Education
Ministry of Human Resource Development
Government of India
Manasagangothri, Hunsur Road, Mysore-570006, Karnataka, India.
Tel: (0821) 2515820 (Director)
Reception/PABX : (0821) 2345000
Fax: (0821) 2515032 (Off)