International Institute of Information Technology (IIIT) Hyderabad is all geared up to release what it calls the Indian Language Machine Translator. The project, headed by Prof Rajeev Sangal, Director of the institute, is being carried out within three in-house labs, each working on a different aspect of Natural Language Processing (NLP).
The research at IIIT has addressed the text and voice aspects of NLP separately. The application is in the finishing stages for ten Indian languages, including Hindi, Punjabi, Marathi, Bengali, Urdu, Malayalam, Tamil, Telugu and Kannada. The product is expected to be ready for commercial use within a year and is targeted at two distinct areas: Pilgrimage & Tourism, and Health. Prof Sangal explains, “We decided on tailoring the application for Pilgrimage and Tourism based on usage trends; the Health application comes with a potential social impact. An ideal example would be a man from Punjab wanting to take his family for a holiday in Kerala. He should be able to access an online forum, post his query in Hindi or Punjabi, and a Keralite at the other end will view this query in Malayalam, reply in Malayalam and our man would see the answer in Punjabi.”
Though language translation has been tried out globally in research, development and commercial deployment, it has failed to overcome the challenges of dialects, grammar changes and colloquialisms. The researchers at IIIT take the view that Indian scripts are sophisticated but not complex, and are incorporating Artificial Intelligence into language translation.
Prof Sangal, who is also an AI expert, embarked on a study of the Panini Vyakaran and correlated it to modern Indian literature before adding elements of AI to enable the computer to learn, condition itself and understand the requirement of the user. The AI aspect is divided into two components: rule-based processing and machine learning. In rule-based processing, an electronic catalogue of words and phrases is fed to the computer, enabling it to understand typical usage of grammar elements. As more and more words and phrases are added to this catalogue, the computer forms a firmer understanding of the rules according to which a particular language operates. Machine learning, on the other hand, focuses on providing statistical data drawn from examples, from which the computer learns how the language is used. While usual translation software translates phrases from one language to another, IIIT's research focuses on the dependency of each word on its neighbouring word to create a pattern of usage for each language. This is meant to take care of dialects, localized modifications in language and slang. In broad terms, this real-time language translation application works on a three-step methodology: analyze, transfer and generate.
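To make the three steps concrete, here is a minimal Python sketch of an analyze, transfer and generate pipeline that combines a hand-written catalogue (the rule-based part) with neighbour-word counts (the statistical part). It is not IIIT's system: the function names, the tiny training list and the bracketed glosses such as `bank[shore]` are all hypothetical, and English stands in for both source and target so that no real translations are claimed.

```python
from collections import Counter

# Hypothetical toy data: ((word, neighbour), target-side gloss).
# A real system would derive such statistics from a large corpus.
TRAINING_PAIRS = [
    (("bank", "river"), "bank[shore]"),
    (("bank", "river"), "bank[shore]"),
    (("bank", "money"), "bank[finance]"),
    (("bank", "loan"), "bank[finance]"),
]

# Hypothetical rule-based catalogue: one default gloss per known word.
CATALOGUE = {"bank": "bank[finance]", "river": "river", "money": "money"}


def train(pairs):
    """Count how often each (word, neighbour) context maps to each gloss."""
    counts = {}
    for context, gloss in pairs:
        counts.setdefault(context, Counter())[gloss] += 1
    return counts


def analyze(sentence):
    """Step 1: tokenize and pair every word with its immediate neighbours."""
    tokens = sentence.lower().split()
    return [
        (tok, tokens[max(0, i - 1):i] + tokens[i + 1:i + 2])
        for i, tok in enumerate(tokens)
    ]


def transfer(analysis, counts):
    """Step 2: choose a gloss per word, preferring neighbour statistics."""
    glosses = []
    for tok, neighbours in analysis:
        votes = Counter()
        for neighbour in neighbours:
            votes.update(counts.get((tok, neighbour), {}))
        if votes:
            glosses.append(votes.most_common(1)[0][0])
        else:
            # No statistics for this context: fall back to the catalogue,
            # or pass the word through unchanged if it is unknown.
            glosses.append(CATALOGUE.get(tok, tok))
    return glosses


def generate(glosses):
    """Step 3: produce the output sentence from the chosen glosses."""
    return " ".join(glosses)


def translate(sentence, counts):
    return generate(transfer(analyze(sentence), counts))


counts = train(TRAINING_PAIRS)
print(translate("river bank", counts))  # river bank[shore]
print(translate("bank loan", counts))   # bank[finance] loan
```

In this sketch the same word receives a different output depending on the word next to it, which is the neighbour-dependency idea in miniature; adding entries to `CATALOGUE` or `TRAINING_PAIRS` plays the role of growing the catalogue and the statistical data described above.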