Assamese and BODO next Word Detection using LSTM

Authors

  • Assistant Professor, Department of Computer Science and Technology, Bodoland University, Kokrajhar, Assam 783370, IN
  • Department of Computer Science and Technology, Bodoland University, Kokrajhar, Assam 783370, IN
  • Professor, Department of Computer Science and Engineering, Assam University, Silchar, Assam 788011, IN

DOI:

https://doi.org/10.18311/jmmf/2023/34159

Keywords:

Assamese Language, Bodo Language, LSTM, NLP, RNN

Abstract

Assamese, an Eastern Indo-Aryan language, is the official language of the Indian state of Assam. As the sole native Indo-Aryan language of the Assam Valley, it has been heavily influenced by the neighbouring Tibeto-Burman languages in lexicon, phonetics, and grammar. Its grammar is known for its highly inflected forms, and it has a variety of pronouns and plural noun forms in both honorific and non-honorific constructions. Assamese is closely related to Bengali and, like Oriya and Bengali, lacks grammatical gender distinctions. Bodo, by contrast, is a group of dialects belonging to the Tibeto-Burman branch of the Sino-Tibetan languages. It is spoken in northeastern India, with speakers in Assam and Meghalaya, as well as in Bangladesh. It is related to the Dimasa, Tripura, and Lalunga languages and is written in the Bengali, Latin, and Devanagari scripts.

Next word prediction, also known as language modeling, is the task of predicting which word will come next. It is one of the primary tasks in NLP and has a wide range of applications. Our objective is to build such a model as quickly and efficiently as possible. Recurrent neural networks (RNNs) with long short-term memory can take preceding text into account when predicting words, which can help users as they build sentences. The method forms words through letter-by-letter prediction. Next word prediction benefits users by making typing faster and more accurate. It is particularly useful for Assamese and Bodo, because many different characters are produced by pressing the same consonant keys combined with different vowels, vowel combinations, and special keys. We therefore present a Long Short-Term Memory (LSTM) network model for Assamese and Bodo next word prediction. We test the proposed network model on 63,300 sentences, and it achieves 96 per cent accuracy. In addition, we compared the proposed model with state-of-the-art models such as the LSTM. The experimental findings indicate that the proposed network model gives promising results.
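For readers unfamiliar with the LSTM named in the abstract, the following is a minimal sketch of a single LSTM cell update in its standard textbook formulation, written in plain Python. It is not the authors' implementation: the abstract does not describe the network's architecture, so the function names (lstm_step, zero_params), the parameter layout, and the vector sizes are all illustrative assumptions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM cell update.

    params is a hypothetical layout: it maps each gate name
    ("i" input, "f" forget, "o" output, "g" candidate) to (W, U, b).
    """
    def gate(name, activation):
        W, U, b = params[name]
        return [activation(v) for v in affine(W, x, U, h_prev, b)]

    i = gate("i", sigmoid)    # how much new information to write
    f = gate("f", sigmoid)    # how much of the old cell state to keep
    o = gate("o", sigmoid)    # how much of the cell state to expose
    g = gate("g", math.tanh)  # candidate values for the cell state

    # New cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: squashed cell state, filtered by the output gate.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zero_params(input_size, hidden_size):
    """All-zero parameters, useful only for checking the arithmetic."""
    def zeros(rows, cols):
        return [[0.0] * cols for _ in range(rows)]

    return {
        name: (
            zeros(hidden_size, input_size),
            zeros(hidden_size, hidden_size),
            [0.0] * hidden_size,
        )
        for name in ("i", "f", "o", "g")
    }
```

In a next word predictor, a step like this would typically be applied once per input token (a word or a character), carrying h and c forward, with the final hidden state passed to a softmax layer over the vocabulary. With all-zero parameters every sigmoid gate equals 0.5 and the candidate is 0, so the cell state is simply halved at each step, which gives a quick way to check the update by hand.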

Published

2023-07-04

How to Cite

Das, A., Baruah, A., & Roy, S. (2023). Assamese and BODO next Word Detection using LSTM. Journal of Mines, Metals and Fuels, 71(5), 614–618. https://doi.org/10.18311/jmmf/2023/34159

Section

Articles

 
