
Journal of Artificial Intelligence & Cloud Computing

Multi Block Transformer for Malayalam Language Modeling

Author(s): Rohit TP*, Sasi Gopalan and Varsha Shaheen

In this research, we present a novel neural network architecture for natural language generation, specifically designed for Malayalam text. We adapted the Transformer architecture, which is commonly used in language modeling, and extended it to work with non-Latin languages. To evaluate the effectiveness of our model, we trained it on a large corpus of Malayalam text and tuned the hyper-parameters using a grid search. Our model achieved a significant improvement in generating coherent and grammatically correct Malayalam text compared to state-of-the-art models. The model began generating text after just 4000 iterations and effectively generalized the relationships between the symbols and alphabets of the language within 8000 training iterations. The Transformer architecture used proved highly efficient for language modeling. Our work highlights the importance of developing new model architectures for text generation in complex and rich languages and opens up new avenues for future research in this area.
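
Since the abstract does not give implementation details, the sketch below is only an illustrative, minimal decoder-only Transformer language model operating on Malayalam text at the character level; the class name, hyper-parameter values, and sample string are assumptions for demonstration, not the authors' actual multi-block model or configuration.

```python
# Minimal sketch (assumed, illustrative): a character-level, decoder-only
# Transformer language model for Malayalam text using PyTorch.
import torch
import torch.nn as nn

class CharTransformerLM(nn.Module):
    def __init__(self, vocab_size, d_model=128, n_heads=4, n_layers=4, max_len=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # character embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)       # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)       # next-character logits

    def forward(self, idx):
        # idx: (batch, seq) indices of Malayalam characters
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: each position attends only to earlier characters
        mask = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=idx.device), diagonal=1)
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (batch, seq, vocab_size)

# Toy usage: build a character vocabulary directly from a Malayalam string.
text = "മലയാളം ഭാഷ"                      # sample text (assumed)
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
idx = torch.tensor([[stoi[c] for c in text]])
model = CharTransformerLM(vocab_size=len(chars))
print(model(idx).shape)                   # torch.Size([1, len(text), vocab_size])
```

Character-level (Unicode code point) modeling is one plausible way to capture the relationship between Malayalam symbols and alphabets mentioned in the abstract; the paper itself should be consulted for the actual tokenization and block structure.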
