Research Article

Development of an Artificial Intelligence Based Correction System for Spelling Errors in Product Reviews

Year 2024, Volume 7, Issue 2, 99 - 108, 31.12.2024
https://doi.org/10.70030/sjmakeu.1577809

Abstract

E-commerce has grown rapidly in recent years and continues to expand. In this sector, maximizing customer satisfaction and enhancing the shopping experience are recognized as important strategic goals. Maximizing customer satisfaction requires accurately identifying customer needs and offering appropriate solutions to meet demand, so customer feedback is highly valuable. However, customer comments often contain spelling errors, which complicates their analysis. This study aims to automatically correct spelling errors in user comments on products sold on the e-commerce site Trendyol.com. For this purpose, a system based on the transformer architecture was created, and various spelling error detection and correction models were developed on top of this architecture. The prediction models were trained using four datasets: two separate datasets of Trendyol user comments and two additional datasets, including the Turkish Spelling Check Dataset taken from the Hunspell library. The effect of each of these four datasets on prediction performance was examined. Model performance was evaluated using the Accuracy metric and was also compared with that of the model in the Zemberek library. The results show that using the Turkish Spelling Check Dataset improves prediction performance. The developed system enhances the customer experience by correcting spelling errors in comments.
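The Accuracy metric mentioned above can be illustrated with a minimal sketch. The abstract does not state whether accuracy is measured per word or per comment, so this example assumes exact-match comparison at the word level; the function name `correction_accuracy` and the sample words are hypothetical and are not taken from the study.

```python
def correction_accuracy(predictions, references):
    """Return the fraction of predictions that exactly match their reference.

    Assumption: each item is one corrected word (or one whole comment);
    the study's actual evaluation granularity is not given in the abstract.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    # Count positions where the model output equals the gold spelling.
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


# Hypothetical model outputs and gold-standard spellings (illustrative only).
predicted = ["ürün", "güzel", "kargo", "hizli"]
expected = ["ürün", "güzel", "kargo", "hızlı"]

score = correction_accuracy(predicted, expected)
print(f"Accuracy: {score:.2f}")
```

The same function could be applied to the outputs of two different correctors on one shared test set to compare them, which is the kind of comparison the abstract describes between the developed models and the Zemberek model.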

References

  • Sanbella, L., Van Versie, I., & Audiah, S. (2024). Online Marketing Strategy Optimization to Increase Sales and E-Commerce Development: An Integrated Approach in the Digital Age. Startupreneur Business Digital (SABDA Journal), 3(1), 54-66.
  • Wang, Z., Zhu, Y., He, S., Yan, H., & Zhu, Z. (2024). LLM for sentiment analysis in e-commerce: A deep dive into customer feedback. Applied Science and Engineering Journal for Advanced Research, 3(4), 8-13.
  • Aytan, B., & Şakar, C. O. (2023). Deep learning-based Turkish spelling error detection with a multi-class false positive reduction model. Turkish Journal of Electrical Engineering and Computer Sciences, 31(3), 581-595.
  • Dutta, A., Polushin, G., Zhang, X., & Stein, D. (2024). Enhancing E-commerce Spelling Correction with Fine-Tuned Transformer Models. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 4928-4938). https://doi.org/10.1145/3637528.3671625
  • Isbarov, J., Huseynova, K., & Rustamov, S. (2024, April). Robust automated spelling correction with deep ensembles. In Proceedings of the 2024 8th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence (pp. 26-30). https://doi.org/10.1145/3665065.3665070
  • Kakkar, V., Sharma, C., Pande, M., & Kumar, S. (2023, July). Search Query Spell Correction with Weak Supervision in E-commerce. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track) (pp. 687-694). https://doi.org/10.18653/v1/2023.acl-industry.66
  • Kuznetsov, A., & Urdiales, H. (2021). Spelling correction with denoising transformer. arXiv preprint arXiv:2105.05977.
  • Naziri, A., & Zeinali, H. (2024). A Comprehensive Approach to Misspelling Correction with BERT and Levenshtein Distance. arXiv preprint arXiv:2407.17383.
  • Oral, E., Mancuhan, K., Erdem, H. V., & Hatipoglu, P. E. (2024, May). Turkish Typo Correction for E-Commerce Search Engines. In Proceedings of the Seventh Workshop on e-Commerce and NLP@ LREC-COLING 2024 (pp. 65-73).
  • Pankam, I., Limkonchotiwat, P., & Chuangsuwanich, E. (2023, June). Two-stage Thai Misspelling Correction based on Pre-trained Language Models. In 2023 20th International Joint Conference on Computer Science and Software Engineering (JCSSE) (pp. 7-12). IEEE.
  • Ratnam, D. J., Karthika, A. N., Praveena, K., Taniya, R., Thara, S., & Prema, N. (2024). Phonogram-based Automatic Typo Correction in Malayalam Social Media Comments. Procedia Computer Science, 233, 391-400.
  • Santoso, J. T., & Yan, S. (2024). A Hybrid Approach to Typo Correction in Indonesian Documents Using Levenshtein Distance. Journal of Technology Informatics and Engineering, 3(2), 151-168.
  • Sharma, S., Valls-Vargas, J., King, T. H., Guerin, F., & Arora, C. (2023, July). Contextual multilingual spellchecker for user queries. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 3395-3399).
  • Soyusiawaty, D., & Wolley, D. H. R. (2021). Hybrid spelling correction and query expansion for relevance document searching. International Journal of Advanced Computer Science and Applications, 12(8).
  • Tohidian, F., Kashiri, A., & Lotfi, F. (2022, November). BEDSpell: Spelling Error Correction Using BERT-Based Masked Language Model and Edit Distance. In International Conference on Service-Oriented Computing (pp. 3-14). Cham: Springer Nature Switzerland.
  • Toleu, A., Tolegen, G., Mussabayev, R., Krassovitskiy, A., & Ualiyeva, I. (2022). Data-driven approach for spellchecking and autocorrection. Symmetry, 14(11), 2261.
  • Yanfi, Y., Soeparno, H., Setiawan, R., & Budiharto, W. (2024). Multi-head Attention Based Bidirectional LSTM for Spelling Error Detection in the Indonesian Language. IEEE Access.
  • Phukan, R., Neog, M., & Baruah, N. (2023, July). A Deep Learning Based Approach For Spelling Error Detection In The Assamese Language. In 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT) (pp. 1-7). IEEE.
  • Aziz, R., Anwar, M. W., Jamal, M. H., & Bajwa, U. I. (2021). A hybrid model for spelling error detection and correction for Urdu language. Neural Computing and Applications, 33, 14707-14721.
  • Tien, D. N., Minh, T. T. T., Vu, L. L., & Minh, T. D. (2022). Vietnamese Spelling Error Detection and Correction Using BERT and N-gram Language Model. In Intelligent Systems and Networks: Selected Articles from ICISN 2022, Vietnam (pp. 427-436). Singapore: Springer Nature Singapore.
  • Al-Hussaini, L. (2017). Experience: Insights into the benchmarking data of Hunspell and Aspell spell checkers. Journal of Data and Information Quality (JDIQ), 8(3-4), 1-10.
  • Do, D. T., Nguyen, H. T., Bui, T. N., & Vo, H. D. (2021). VSEC: Transformer-based model for Vietnamese spelling correction. In PRICAI 2021: Trends in Artificial Intelligence: 18th Pacific Rim International Conference on Artificial Intelligence, Hanoi, Vietnam, November 8–12, 2021, Proceedings, Part II (pp. 259-272). Springer International Publishing.
  • Akın, A. A., & Akın, M. D. (2007). Zemberek, an open source NLP framework for Turkic languages. Structure, 10, 1-5.
There are 23 citations in total.

Details

Primary Language: English
Subjects: Natural Language Processing
Journal Section: Original Research Articles
Authors

Okan Çiftçi 0000-0002-9435-8980

Sumru Nayir 0009-0003-4782-3063

Emre Tolga Ayan 0000-0002-4894-2190

Ceren Ulus 0000-0003-2086-6381

Mehmet Fatih Akay 0000-0003-0780-0679

Early Pub Date: December 3, 2024
Publication Date: December 31, 2024
Submission Date: November 4, 2024
Acceptance Date: November 11, 2024
Published in Issue: Year 2024

Cite

APA: Çiftçi, O., Nayir, S., Ayan, E. T., Ulus, C., & Akay, M. F. (2024). Development of an Artificial Intelligence Based Correction System for Spelling Errors in Product Reviews. Scientific Journal of Mehmet Akif Ersoy University, 7(2), 99-108. https://doi.org/10.70030/sjmakeu.1577809