MISSION STATEMENT:

To provide annotated, quality language data (both text and speech) and tools in Indian languages to individuals, institutions, and industry for research and development, created in-house, through outsourcing, and through acquisition.

Datasets (Linguistic Resources)

Speech

Assamese Sentence Aligned Speech Corpus

Bengali Sentence Aligned Speech Corpus

Hindi Sentence Aligned Speech Corpus

Kannada Sentence Aligned Speech Corpus

Konkani Sentence Aligned Speech Corpus

Maithili Sentence Aligned Speech Corpus

Malayalam Sentence Aligned Speech Corpus

Marathi Sentence Aligned Speech Corpus

Nepali Sentence Aligned Speech Corpus

Odia Sentence Aligned Speech Corpus

Tamil Sentence Aligned Speech Corpus

Urdu Sentence Aligned Speech Corpus

Indian English-Bengali variant Sentence Aligned Speech Corpus

Indian English-Kannada variant Sentence Aligned Speech Corpus

Chhattisgarhi Raw Speech Corpus

Assamese Raw Speech Corpus

Dogri Raw Speech Corpus

Gujarati Raw Speech Corpus

Gujarati Raw Speech Corpus (Mono Recordings)

Indian English Raw Speech Corpus - Bengali Variant

Indian English Raw Speech Corpus - Kannada Variant

Kashmiri Raw Speech Corpus

Multilingual Raw Speech Corpus

Odia Raw Speech Corpus

Tamil Raw Speech Corpus

Bengali Raw Speech Corpus

Bodo Raw Speech Corpus

Hindi Raw Speech Corpus

Kannada Raw Speech Corpus

Konkani Raw Speech Corpus

Maithili Raw Speech Corpus

Marathi Raw Speech Corpus

Nepali Raw Speech Corpus

Punjabi Raw Speech Corpus

Telugu Raw Speech Corpus

Urdu Raw Speech Corpus

Malayalam Raw Speech Corpus

Manipuri Raw Speech Corpus

Maithili Raw Speech Corpus Vol II

Dogri Sentence Aligned Speech Corpus

Manipuri Sentence Aligned Speech Corpus (Bengali Script)

Manipuri Sentence Aligned Speech Corpus (Meetei Mayek)

Punjabi Sentence Aligned Speech Corpus

Telugu Sentence Aligned Speech Corpus

Maithili Sentence Aligned Speech Corpus (Tirhuta Script)

Text

A Gold Standard Chhattisgarhi Raw Text Corpus

A Gold Standard Assamese Raw Text Corpus

A Gold Standard Bengali Raw Text Corpus

A Gold Standard Bodo Raw Text Corpus

A Gold Standard Dogri Raw Text Corpus

A Gold Standard Gujarati Raw Text Corpus

A Gold Standard Hindi Raw Text Corpus

A Gold Standard Kannada Raw Text Corpus

A Gold Standard Kashmiri Raw Text Corpus

A Gold Standard Konkani Raw Text Corpus

A Gold Standard Maithili Raw Text Corpus

A Gold Standard Malayalam Raw Text Corpus

A Gold Standard Manipuri Raw Text Corpus

A Gold Standard Marathi Raw Text Corpus

A Gold Standard Nepali Raw Text Corpus

A Gold Standard Odia Raw Text Corpus

A Gold Standard Tamil Raw Text Corpus

A Gold Standard Telugu Raw Text Corpus

A Gold Standard Urdu Raw Text Corpus

A Gold Standard Punjabi Raw Text Corpus

The Mother Tongue Parallel Text Corpus of India Vol I

A Gold Standard Rajasthani Raw Text Corpus

A Gold Standard Chhattisgarhi Raw Text Corpus Vol II

A Gold Standard Kashmiri Raw Text Corpus Vol II

A Gold Standard Maithili Raw Text Corpus Vol II

A Gold Standard Telugu Raw Text Corpus Vol II

Tools

Anuvadika

Lipyantara

Lipidha

Shabd Sandhan

Dhvani Parivartak

AnuLekhika

AnuVachika