Research Article

A Novel Framework for Next Word Prediction Using Long-Short Term Memory Networks

Year 2026, Volume: 28 Issue: 82, 113 - 120, 27.01.2026
https://doi.org/10.21205/deufmd.2026288215

Abstract

Natural Language Processing (NLP) has become a cornerstone in various fields, revolutionizing how machines interpret and process human language. Among its diverse applications, next-word prediction stands out as a highly practical and impactful example of generative AI. This research focuses on the use of Long Short-Term Memory (LSTM) models, a class of Recurrent Neural Networks (RNNs), for predictive text generation. LSTMs excel at capturing sequential and contextual information, making them well suited to language tasks. While transformer models dominate accuracy benchmarks, this work addresses the critical need for efficient alternatives in resource-constrained deployment scenarios. This study presents a novel LSTM-based framework enhanced with a hybrid architecture and advanced regularization techniques, trained on a carefully curated dataset of 15,000 English sentences. The proposed model achieves 84.2% training accuracy, 79.6% test accuracy, and a perplexity of 2.41, significantly outperforming traditional approaches. The methodology addresses overfitting through dropout regularization, batch normalization, and adaptive learning rate strategies while effectively capturing long-term contextual dependencies. This research contributes to the advancement of neural language modeling by providing a robust framework that bridges the gap between computational efficiency and prediction accuracy in real-world NLP applications.
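The abstract names the framework's main ingredients (stacked LSTM layers, dropout regularization, batch normalization, an adaptive learning rate, and a softmax next-word output) but this page does not give exact hyperparameters. The Keras sketch below shows one plausible arrangement of those pieces; the vocabulary size, context length, layer widths, dropout rates, and the plateau-based learning-rate schedule are illustrative assumptions, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

VOCAB_SIZE = 10_000  # assumed vocabulary cap; the page does not state one
SEQ_LEN = 20         # assumed context window, in tokens

# Stacked-LSTM next-word model: embedding -> LSTM -> batch normalization
# -> dropout -> LSTM -> dropout -> softmax over the vocabulary.
model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),
    layers.LSTM(256, return_sequences=True),
    layers.BatchNormalization(),
    layers.Dropout(0.3),   # dropout regularization against overfitting
    layers.LSTM(128),
    layers.Dropout(0.3),
    layers.Dense(VOCAB_SIZE, activation="softmax"),  # next-word distribution
])

model.compile(
    optimizer="adam",                         # adaptive per-parameter steps
    loss="sparse_categorical_crossentropy",   # targets are integer word ids
    metrics=["accuracy"],
)

# One way to realize an "adaptive learning rate strategy": halve the
# learning rate whenever validation loss stops improving.
lr_schedule = callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-5
)

# x: (num_examples, SEQ_LEN) integer token ids; y: (num_examples,) next ids.
# model.fit(x, y, validation_split=0.2, epochs=50, callbacks=[lr_schedule])
```

Plateau-based decay is only one reading of "adaptive learning rate"; the phrase could equally refer to the adaptive optimizer itself.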

References

  • Chowdhary KR. Natural language processing. In: Fundamentals of artificial intelligence. Springer; 2020, p. 603-49.
  • Yücesan E, Erkan MA, Deveci A, Medeni İT. Bekenbey AI: Innovative Solutions at the Intersection of Deep Learning and Law. CÜMFAD 2024;2(2):185-92.
  • Hong Z. Enabling scientific information extraction with natural language processing. Nature Communications 2024;15.
  • Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation 1997;9(8):1735-80.
  • Devlin J, Chang M, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of NAACL-HLT. 2019, p. 4171-86.
  • Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. OpenAI Technical Report; 2018.
  • Chen SF, Goodman J. An empirical study of smoothing techniques for language modeling. Computer Speech & Language 1999;13(4):359-94.
  • Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 1989;77(2):257-86.
  • Bengio Y, Ducharme R, Vincent P, Jauvin C. A neural probabilistic language model. Journal of Machine Learning Research 2003;3:1137-55.
  • Mikolov T, Karafiát M, Burget L, Černocký J, Khudanpur S. Recurrent neural network based language model. In: Interspeech. 2010;2(3):1045-8.
  • Rianti A, Widodo S, Ayuningtyas AD, Hermawan FB. Next word prediction using LSTM. Journal of Information Technology and Its Utilization 2022;5(1):432033.
  • Ganai AF, Khursheed F. Predicting next word using RNN and LSTM cells: Statistical language modeling. In: 2019 Fifth International Conference on Image Information Processing (ICIIP). IEEE; 2019, p. 469-74.
  • Aliprandi C, Carmignani N, Deha N, Mancarella P, Rubino M. Advances in NLP applied to word prediction. Italy: University of Pisa; 2008.
  • Sharma R, Goel N, Aggarwal N, Kaur P, Prakash C. Next word prediction in Hindi using deep learning techniques. In: 2019 International Conference on Data Science and Engineering (ICDSE). IEEE; 2019, p. 55-60.
  • Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog 2019;1(8):9.
  • Jelinek F, Merialdo B, Roukos S, Strauss M. A dynamic language model for speech recognition. In: Proceedings of the workshop on Speech and Natural Language. 1991, p. 293-5.
  • Papineni K, Roukos S, Ward T, Zhu WJ. BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002, p. 311-8.
  • Strubell E, Ganesh A, McCallum A. Energy and policy considerations for deep learning in NLP. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019, p. 3645-50.
  • Schwartz R, Dodge J, Smith NA, Etzioni O. Green AI. Communications of the ACM 2020;63(12):54-63.
  • Li E, Zeng L, Zhou Z, Chen X. Edge AI: On-demand accelerating deep neural network inference via edge computing. IEEE Transactions on Wireless Communications 2019;19(1):447-57.
  • Rogers A, Kovaleva O, Rumshisky A. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 2020;57:615-31.

A New Method for Next Word Prediction: Using Long-Short Term Memory Networks


Abstract

Natural Language Processing (NLP) has become a fundamental element in many fields, radically changing how machines interpret and process human language. Among NLP's many applications, next-word prediction stands out as a highly practical and effective example of generative AI. This research focuses on the use of Long Short-Term Memory (LSTM) models as predictive models for text generation. With their strength in capturing sequential and contextual information, LSTMs are an innovative variant of the Recurrent Neural Network (RNN) family that is well suited to language tasks. Although transformer models lead accuracy benchmarks, this study addresses the critical need for efficient alternatives in resource-constrained deployment scenarios. In this study, an LSTM-based model equipped with a hybrid architecture and advanced regularization techniques was developed using a specially prepared dataset of 15,000 English sentences. The model achieved 84.2% training accuracy, 79.6% test accuracy, and a perplexity of 2.41, significantly outperforming traditional approaches. The proposed method addresses overfitting through dropout regularization, batch normalization, and adaptive learning rate strategies while effectively capturing long-range contextual dependencies.
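Both abstracts report a perplexity of 2.41 alongside the accuracy figures. Perplexity is the exponentiated mean cross-entropy of the model's next-word predictions, so the two quantities are directly convertible; the short check below (not taken from the paper) makes the relationship concrete.

```python
import math

# Perplexity is exp(H), where H is the mean cross-entropy (in nats) of
# the model's next-word distribution against the true next words.
def perplexity(mean_crossentropy_nats: float) -> float:
    return math.exp(mean_crossentropy_nats)

# A reported perplexity of 2.41 corresponds to a mean loss of
# ln(2.41) ≈ 0.88 nats: on average the model is about as uncertain as a
# uniform choice among roughly 2.4 candidate words.
loss = math.log(2.41)            # ≈ 0.8796
assert abs(perplexity(loss) - 2.41) < 1e-9
```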


Details

Primary Language English
Subjects Data Communications
Journal Section Research Article
Authors

Ali Deveci 0000-0002-4990-0785

Mehmet Ali Erkan 0009-0007-5760-1914

İhsan Tolga Medeni 0000-0002-0642-7908

Tunç Durmuş Medeni 0000-0002-2964-3320

Submission Date March 27, 2025
Acceptance Date July 4, 2025
Publication Date January 27, 2026
Published in Issue Year 2026 Volume: 28 Issue: 82

Cite

Vancouver Deveci A, Erkan MA, Medeni İT, Medeni TD. A Novel Framework for Next Word Prediction Using Long-Short Term Memory Networks. DEUFMD. 2026;28(82):113-20.

This journal is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
