Central Institute of Indian Languages [CIIL]
MISSION STATEMENT: To provide annotated, quality language data (both text and speech) and tools in Indian languages to individuals, institutions and industry for research and development, whether created in-house, through outsourcing, or by acquisition.
Comparable Text Corpora

Status of Comparable Text Corpora (with English):

Sl. No.   Language   No. of Words
1.        Bengali    145845
2.        Dogri      267393
3.        Hindi      3609281
4.        Kannada    873634
5.        Kodava     182741
6.        Maithili   225686
7.        Nepali     301790



Developed & Maintained by: LDC-IL, CIIL
Copyright © LDC-IL, Central Institute of Indian Languages
Central Institute of Indian Languages
Department of Higher Education
Ministry of Human Resource Development
Government of India
Manasagangothri, Hunsur Road, Mysore-570006, Karnataka, India.
Tel: (0821) 2515820 (Director)
Reception/PABX : (0821) 2345000
Fax: (0821) 2515032 (Off)