HINDI ANNOTATED SPEECH CORPORA |
Language Data Type |
: |
Speech Corpus |
Language |
: |
Hindi |
Linguality |
: |
Monolingual |
Description |
: |
This Hindi speech recognition database was collected in Uttar Pradesh and Bihar and contains the voices of 650 native speakers, selected according to age group (16-20, 21-50, 51+), gender, dialect region, and recording environment (home, office, and public place). Each speaker read a news text in a noisy environment, recorded on a device with a built-in microphone. The recordings are in stereo, and the extracted channels are also included in the respective files. The corpus includes audio files, text files, and NIST files, saved as .zip archives. All the speech data are transcribed and labeled at the sentence level. |
Area(s) |
: |
Form and function words, command and control words, phonetically balanced vocabulary, proper names, and the 1000 most frequent words. |
Developed by organization(s), person(s) or within project(s) |
: |
Linguistic Data Consortium for Indian Languages |
Sample Downloads URL |
: |
Annotated Data, Sentences |
Annotation Validation |
: |
The entire dataset has been validated and proofread manually. |
Technological method(s) |
: |
Speech Recognition |
Contact |
: |
ramamoorthy@ciil.org |
Suggested purpose |
: |
This database is intended for tuning and testing automatic speech recognition (ASR) systems. |
Distributor |
: |
Linguistic Data Consortium for Indian Languages |