Knowledge Sharing Event - 2
Making of Electronic Dictionary
24 March, 2010
Language technology work carried out in Indian languages to date, including dictionaries, thesauri, wordnets, morphological analyzers and generators, spell checkers and POS taggers, operates primarily at the level of words. Sentences, however, show rich and varied structures, and it is very important to be able to analyze, capture and utilize the syntactic structure of natural language sentences. Going beyond words and handling relevant linguistic phenomena at the level of sentences is essential for advanced language technology applications such as automatic translation, question answering and automatic summarization. A computational grammar must be precise enough that a computing machine can apply it mechanically for parsing and generation.
In this context, the challenge is to develop computational grammars that do not require commonsense or world knowledge. Moreover, grammars written for human readers normally discuss only irregular cases and exceptions, assuming that readers already know the usual basic rules. By contrast, a computational grammar must be comprehensive and should encode extensive and thorough knowledge of all possible real-world grammatical sentences, both simple and complex.
Developing computational grammars involves the following tasks and subtasks, which can be accomplished in phases:
Task 1: POS tagging
Task 2: Building an electronic dictionary
Task 3: Developing a morphological analyzer and generator
Task 4: Semantic tagging
Task 5: Building a chunker
Task 6: Treebanking
Task 7: Shallow and deep parsing
In this context, LDC-IL is organising a series of events to bring together researchers working in these areas so that they can share their knowledge with other researchers. Each event in the series will focus on one aspect of computational grammar and address the issues related to that theme.
With technological advancement, the digital dictionary (e-dictionary as well as online and desktop dictionary) is emerging as the standard dictionary format, reducing the use of the traditionally valued paper dictionary. Apart from the traditional look-up of spelling, meaning and usage, the digital dictionary has gained additional uses in emerging domains of language use, such as spell checking and grammar-related applications. In these newer domains of language use and research, the digital dictionary and other digital lexicographical resources (DLR) such as WordNet, thesaurus, lexipedia, glossary and lexicon (general as well as domain-specific) are of immense value. They form the basis of various aspects of NLP. These DLR, whether constructed via automatic acquisition (i.e. scanning of paper dictionaries or other resources) or hand-crafted, pose challenges both lexicographically and computationally.
The event aims to address these challenges and to share and explore innovative developments in the field of digital lexicography. It also aims to provide an academic space for sharing and assessing new methodologies, the latest information, tools, techniques, theories and technologies in the field, in the spirit of promoting digital lexicography as a consolidated research area and widening its application.
Abstracts of no more than 500 words (including references), stating the title of the paper, author name(s), institutional affiliation and email address, should be submitted by the due date in *.rtf or *.pdf format to the following email address:
Last date for submitting the abstract: 27th November, 2009
Abstract acceptance notification: 5th December, 2009
Last date for submitting full-length papers: 1st February, 2010
Minimal financial support for travel shall be provided to the authors of a few selected papers.