
Benchmarking MT Systems: A Public Tool for the Evaluation of MT Systems

Teena Verma

Student
The English and Foreign Languages University, Hyderabad


Authors: Dr. Narayan Choudhary, Greeshma Mahesh, Teena Verma, Shree Vaishnavi, and Stephen Fernandez

Abstract

Machine Translation (MT) systems have become essential in bridging language barriers, with artificial intelligence enhancing their accuracy and contextual relevance. These AI-driven MT systems offer real-time translations, adapt to user preferences, and continuously improve through machine learning. However, they are not infallible and can produce inaccurate or culturally inappropriate translations. This paper presents the development of a web-based, publicly available tool for evaluating MT output in any given language pair. It analyses MT systems using three evaluation metrics: BLEU, chrF++, and METEOR. We review existing MT evaluation metrics and identify gaps in current research, highlighting the need for an automated evaluation system in the age of AI. The study applies these metrics to translation data for 13 Indian languages from Bhashini, Google, and Bing. Our results show that existing MT systems score lower on low-resource languages because of data scarcity. In addition, BLEU yields comparatively lower scores owing to its heavy reliance on exact word-level matches and its inability to account for synonyms or paraphrases. To address these concerns, we have developed a website that allows users to upload MT-generated translations, together with reference translations, and compute their accuracy. The tool thus helps demonstrate the limitations of current translation technologies. The scope of this research is limited to the implementation of pre-existing string-based metrics for Indian languages. Future research should aim to develop metrics that capture overall contextual meaning rather than being constrained by word-level matching.

Keywords: Machine Translation, Evaluation Metrics, BLEU, chrF++, METEOR
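
As an illustration of how the three metrics named in the abstract can be computed in practice, the sketch below scores a file of MT output against a file of reference translations using the open-source sacrebleu and NLTK libraries. This is a minimal, standalone example and not the implementation behind the website described in the paper; the file names and the whitespace tokenisation are assumptions made for the sketch.

```python
# Minimal sketch: corpus-level BLEU and chrF++ via sacrebleu, and sentence-level
# METEOR via NLTK averaged over the corpus. File names below are illustrative.
# Requires: pip install sacrebleu nltk
import nltk
from nltk.translate.meteor_score import meteor_score
from sacrebleu.metrics import BLEU, CHRF

# METEOR in NLTK uses WordNet for synonym matching (English-centric).
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)


def read_lines(path):
    """Read one segment per line, stripping trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# One MT output segment per line, aligned with one reference segment per line.
hypotheses = read_lines("mt_output.txt")   # assumed file name
references = read_lines("reference.txt")   # assumed file name

bleu = BLEU()                  # standard corpus-level 4-gram BLEU
chrf_pp = CHRF(word_order=2)   # word_order=2 turns chrF into chrF++

print("BLEU  :", round(bleu.corpus_score(hypotheses, [references]).score, 2))
print("chrF++:", round(chrf_pp.corpus_score(hypotheses, [references]).score, 2))

# NLTK's METEOR is sentence-level and expects pre-tokenised input; a plain
# whitespace split is a rough assumption for Indian-language text.
meteor_avg = sum(
    meteor_score([ref.split()], hyp.split())
    for ref, hyp in zip(references, hypotheses)
) / len(hypotheses)
print("METEOR:", round(meteor_avg * 100, 2))  # scaled to 0-100 for comparison
```

A web front end such as the one described in the paper would wrap the same scoring calls behind a file-upload form; the scores printed here are on a 0 to 100 scale so the three metrics can be compared side by side.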