Samsung Research and Development Institute India-Bangalore (SRI-B), one of the South Korean tech giant's key innovation centres worldwide, worked with academic partners to develop Hindi language support for Galaxy AI, the suite of artificial intelligence (AI) features for Samsung devices. The work covered 20 regional dialects of the language, along with tonal inflections and colloquialisms. One of the key partners, the Vellore Institute of Technology (VIT), assisted with sourcing large quantities of data to train the large language model.
SRI-B develops Hindi language support for Galaxy AI
In a newsroom post on the Samsung India website, the company highlighted the role played by its R&D institute and academic partners in developing Galaxy AI's understanding of the regional language. The company said it is working on expanding Galaxy AI to more languages so that more users can use these features in their native language. The language capability enhances some of the on-device features, such as Live Translate, Interpreter, Note Assist, and Browsing Assist.
To develop Hindi support for the AI suite, SRI-B worked with VIT. According to the post, the academic institute helped source nearly a million lines of segmented and curated audio data covering conversational speech, words, and commands.
The collaboration was prompted by the difficulty of generating high-quality data in Hindi, given the high degree of regional variation. The SRI-B team covered 20 regional dialects, tonal inflections, punctuation, and colloquialisms.
Samsung developed a facility for VIT under its Students Ecosystem for Engineered Data (SEED) Labs initiative in 2021. The company says the lab is equipped with head and torso simulators, binaural microphones, and hearing devices surrounded by an advanced sound absorption system. It also requests projects on which university staff, students, and interns can work alongside technical experts provided by the tech giant.
Notably, SRI-B collaborated with teams across the globe to develop AI language models for British, Indian, and Australian English, as well as Thai, Vietnamese, and Indonesian. The research centre’s role was considered pivotal, and it was therefore asked to develop Hindi language support for Galaxy AI.