AI translation tools support India’s indigenous language policy

While English has been the main language of instruction in higher education in India, many students from non-English speaking backgrounds struggle to understand study materials, which impedes their learning experience and affects academic performance.

Earlier this year India’s higher education regulator, the University Grants Commission (UGC), asked universities to allow students to write exams in local languages even if the course is taught in English.

Given that the government now encourages education in local languages, demand for the translation of English books into Indic languages is also gaining momentum. The UGC stressed the need to strengthen and “promote such initiatives as writing textbooks in the mother tongue/local languages and encouraging their use in teaching, including translating standard books from other languages”.

With the emergence of effective generative AI tools such as ChatGPT, the UGC issued new guidelines in July for the translation of higher education textbooks using artificial intelligence tools. The guidelines were issued to ensure consistency and maintain the quality of translated textbooks at both national and regional levels. They were also timely, recognising the role such tools can play in translation.

India has 22 official languages and 270 mother tongues, according to the National Census. In such a diverse country, the use of AI tools, particularly generative AI writing tools such as OpenAI’s ChatGPT, which can translate and write human-like text, could become a game-changer for students, teachers and other stakeholders, educators said.

Experts say AI-assisted translation benefits not just students but also higher education institutions, as it offers a cost-effective way to promote multilingual education.

Projects for AI translation of study materials

Translating study materials into multiple languages usually involves a lot of time, effort and resources, but AI translation can simplify the process significantly.

Many top engineering colleges in India are undertaking generative AI research projects to work out how the technology can be used to create tools in Indian languages, similar to OpenAI’s ChatGPT.

The Indian Institute of Technology (IIT) Bombay, under the guidance of Professor Ganesh Ramakrishnan of the department of computer science and engineering, launched Project Udaan, which enables translation from English to all Indian languages.

Project Udaan is described as an ‘end-to-end ecosystem’. It translates scientific and technical content from English to all other Indian languages through an AI-based translation ecosystem, Ramakrishnan explained.

Ramakrishnan told University World News: “It saves a lot of time and effort. Domain [subject-matter] and linguistics experts working manually are likely to take considerably more time.

“But Project Udaan enables translation of textbooks and learning materials from English to Hindi and all Indian languages in almost one-sixth of the time.”

IIT Madras in Chennai, in Tamil Nadu state, has also been working on AI-assisted translation of YouTube videos into Tamil, for example.

In July IIT Madras announced the launch of an institute specifically dedicated to the development of language technology for Indian languages, after receiving a grant of more than US$4.5 million from Nilekani Philanthropies, the foundation run by Indian entrepreneur Nandan Nilekani, co-founder of Indian tech giant Infosys.

“While language technology has significantly improved for English and a few languages, Indian languages are lagging behind. The focus of the centre would be to bridge this gap,” said Mitesh Khapra, professor of computer science and engineering at IIT Madras.

India’s regulator for technical education, the All India Council for Technical Education (AICTE), is in the process of translating bachelor of technology textbooks into Indian languages, as many engineering colleges, which come under AICTE’s remit, have begun to offer courses in Indian languages.

AICTE has developed ‘Anuvadini’, an AI-powered translation tool specifically for Indian languages, which higher education institutions can use for translation.

The UGC’s guidelines on AI translation point to Anuvadini’s many benefits, such as translation within the source text file and speech-to-text typing. Editing facilities are integrated into the tool, so AI-assisted translation can be followed by further manual editing in the same tool.

According to Panchanan Mohanty, professor in the school of languages and literature at Nalanda University in the eastern state of Bihar, AI translations nonetheless need to be authenticated by trained professionals.

“A machine can only do raw translation because a machine has no intelligence. We can write a programme and give it to the machine and the machine does whatever it can do,” Mohanty told University World News.

Lack of language data

Mohanty added, however, that the main issue with machine translations is that “exhaustive” dictionaries are not available for all Indian languages.

The Indian government’s Commission for Scientific and Technical Terminology, under the Ministry of Education’s department of higher education, has already developed glossaries in 10 Indian languages for various subjects, which can be used for translation or creating new words or terms.

But most AI-assisted language models are trained on local language newspapers and state government language data repositories, and these are not sufficient for accuracy in some languages.

In its guidelines, the UGC recommends that complex technical terms be presented in English within brackets after their Indian language equivalents. In the absence of Indian language equivalents, English words can be transliterated into the target language script as substitutes.
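The rule described above can be illustrated with a minimal Python sketch. It is not part of the UGC guidelines or of any tool named in this article: the function name `render_term`, the two sample Hindi entries and the fallback of leaving a term in English when neither an equivalent nor a transliteration is available are all assumptions made for illustration.

```python
from typing import Mapping

# Hypothetical sample data for illustration only; these entries are not
# taken from any official glossary.
GLOSSARY_HI = {"temperature": "तापमान"}            # Indian language equivalents
TRANSLITERATION_HI = {"algorithm": "एल्गोरिदम"}    # English words in target script


def render_term(
    term: str,
    glossary: Mapping[str, str],
    transliterations: Mapping[str, str],
) -> str:
    """Format one technical term following the rule described above."""
    key = term.lower()
    if key in glossary:
        # An equivalent exists: show it, then the English term in brackets.
        return f"{glossary[key]} ({term})"
    if key in transliterations:
        # No equivalent: substitute the transliterated English word.
        return transliterations[key]
    # Assumption: with neither available, leave the English term unchanged.
    return term


if __name__ == "__main__":
    for word in ("temperature", "algorithm", "entropy"):
        print(render_term(word, GLOSSARY_HI, TRANSLITERATION_HI))
```

In practice the lookup tables would come from sources such as the glossaries mentioned earlier, and any output would still need the manual editing the guidelines call for.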

The UGC further highlights the importance of promoting pan-Indian common core vocabulary across all Indian languages, facilitating easier translation and comprehension among speakers of different languages.

While word-for-word translations are not mandatory, the translated texts should effectively convey the complete meaning and concepts present in the original text. Thorough editing is essential, involving the careful reading of both the original and translated documents, according to the UGC guidelines.

Use of ChatGPT

Indian researchers have been working on translation projects for more than three years, but quality remains highly variable, depending on the target indigenous language.

Even with sufficient language data, developing new tools for translating teaching and learning materials can be costly in terms of research time, and it also requires significant computing power. Without powerful computers, the training of large language models to improve accuracy is hampered.

Experts at IIT-Delhi noted that no academic institutions, or even smaller tech companies, have the resources and computing power for regular training of such language models.

In the absence of effective local language substitutes, ChatGPT has emerged as an effective tool in India’s education sector. Its sophisticated capabilities have been used to develop several ground-breaking applications that have revolutionised the traditional learning experience for students.

In May, Amity University Online launched an Artificial Intelligence Professor ‘AMI’ for its online higher education platform.

Powered by ChatGPT4 and OpenAI-driven technology, Professor AMI is designed to personalise the learning experience “for every student of Amity Online based on their strengths, areas of improvement, and learning styles”, according to the university.

But many educators also point to challenges in using ChatGPT. The quality of the content it produces has to be checked to ensure high standards. Some have also expressed apprehension that ChatGPT-based applications could replace teachers, but academics believe the technology is more useful as a tool to assist teachers than as a replacement for them.