Knowledge Sharing Event - 1
Morphological Analysers and Generators
22 - 23 March, 2010
Language technology work carried out in Indian languages to date, including dictionaries, thesauri, wordnets, morphological analyzers and generators, spell checkers, and POS taggers, is primarily at the level of words. Sentences show rich and varied structures, and it is very important to be able to analyze, capture and utilize the syntactic structure of natural language sentences. Going beyond words and handling relevant linguistic phenomena at the level of sentences is essential for advanced language technology applications such as automatic translation, question answering and automatic summarization. A computational grammar must be so precise that a computing machine can mechanically apply the grammar for parsing and generation.
In this context, the challenge is to develop computational grammars that do not require commonsense or world knowledge. Also, grammars meant for human users normally discuss only irregular cases and exceptions, assuming that readers already know the usual basic rules. By contrast, a computational grammar must be comprehensive and should cover the full range of real-world grammatical sentences, both simple and complex.
Developing computational grammars involves the following tasks and subtasks, which can be accomplished in phases:
Task 1: POS tagging
Task 2: Building Electronic Dictionary
Task 3: Developing Morphological analyzer and generator
Task 4: Semantic tagging
Task 5: Building Chunker
Task 6: Tree banking
Task 7: Shallow and deep parsing.
In this context, the LDC-IL is organising a series of events to bring together researchers working in these areas to share their knowledge with other researchers. Each event in the series will focus on one aspect of computational grammar and address issues related to that theme.
The development of core technologies has an important role to play within the context of Natural Language Processing (NLP). Morphological analysis and generation are important components for building computational grammars. As building blocks of NLP, analysis and generation contribute to greater efficiency and accuracy in the development of various language applications.
A morphological analyzer is a program for analyzing the morphology of an input word: it reads the inflected surface form of each word in a text and provides its lexical form. Generation is the inverse process. A morphological generator takes a lemma/root and a morpho-lexical description of the intended word-form as input and generates that word-form.
Both analysis and generation make use of resources such as lexicons, thesauri, stop lists, and indexing engines. There are various approaches to morphological analysis and generation, including rule-based methods and statistical machine learning methods. The processing methods of a morphological analyzer and generator vary in accordance with the characteristics of the target language.
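As a purely illustrative sketch of the rule-based approach, the analysis and generation steps described above can be written as a lookup over a lexicon and a table of suffix rules. The lexicon entries, rule table, feature labels and function names below are all hypothetical, and English is used only for readability; a real system for an Indian language would need far richer rules (sandhi, stem alternations, agglutination) and a full lexicon.

```python
from typing import List, Tuple

# Hypothetical toy lexicon: root -> part of speech (an assumption for illustration).
LEXICON = {"cat": "N", "dog": "N", "walk": "V", "play": "V"}

# Hypothetical suffix rules: (suffix, part of speech, feature label).
SUFFIX_RULES = [
    ("", "N", "sg"),
    ("s", "N", "pl"),
    ("", "V", "base"),
    ("s", "V", "3sg"),
    ("ed", "V", "past"),
    ("ing", "V", "prog"),
]


def analyze(surface: str) -> List[Tuple[str, str, str]]:
    """Return every (root, pos, feature) reading licensed by the rules."""
    analyses = []
    for suffix, pos, feature in SUFFIX_RULES:
        if not surface.endswith(suffix):
            continue
        # Strip the suffix to get a candidate root.
        root = surface[: len(surface) - len(suffix)]
        # Accept the reading only if the root is in the lexicon with this POS.
        if LEXICON.get(root) == pos:
            analyses.append((root, pos, feature))
    return analyses


def generate(root: str, pos: str, feature: str) -> str:
    """Inverse of analyze: build the surface form from root and description."""
    if LEXICON.get(root) != pos:
        raise ValueError(f"unknown root/POS pair: {root}/{pos}")
    for suffix, rule_pos, rule_feature in SUFFIX_RULES:
        if rule_pos == pos and rule_feature == feature:
            return root + suffix
    raise ValueError(f"no rule for {pos}+{feature}")


if __name__ == "__main__":
    print(analyze("walked"))
    print(generate("walk", "V", "prog"))
```

The same rule table drives both directions, which mirrors the statement that generation is the inverse of analysis. An unknown word simply yields an empty list of analyses in this sketch.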
Extensive work is being carried out in the area of NLP, both in India and abroad, towards building high-quality analysers and generators for various languages. Building analysers and generators demands a synergy of theoretical analysis and language processing, which poses a wide range of linguistic and computational challenges.
The two-day event aims to provide a platform for researchers in this area to bring together diverse contributions from theoretical and computational linguistics that explore newer approaches to morphological analysis and generation, and thereby to share their experiences and learn from one another.
The goal of the event is to inform researchers about current projects and studies that identify, investigate, and (attempt to) resolve issues relating to morphological analysis and generation, and to give researchers the opportunity to meet, interact and present their work to the broader community of NLP researchers.
Abstracts of no more than 500 words (including references), stating the title of the paper, author name(s), institutional affiliation, and email address, should be submitted by the due date in *.rtf or *.pdf format to the following email address:
Last date for submitting the Abstract : 28th December, 2009
Abstract acceptance notification : 31st December, 2009
Last date for submitting full length Papers : 2nd March, 2010
Minimal financial support for travel shall be provided to the authors of a few selected papers.