Review

Machine learning use for English texts’ classification (A mini-review)

Year 2024, Volume: 7 Issue: 1, 414 - 423, 22.01.2024
https://doi.org/10.47495/okufbed.1259868

Abstract

Using classification to retrieve information and extract data from text also improves the reader's understanding of the content. As a result of advances in technology, new methods have been proposed that are not only highly accurate but also faster than previous methods. Various factors have been used to classify text to date, including “Sentiment Analysis, Language Detection, Intent Detection, Spam Detection, and Topic Detection”. In English linguistics, one of the most common problems is classifying texts according to their readability level. In this article, the authors review the use of machine learning in classifying English text by difficulty and readability level. The authors also discuss the drawbacks of deep learning methods in terms of accuracy and speed.


Use of Machine Learning for the Classification of English Texts


Abstract

Using classification to retrieve information and extract data from text also improves the reader's understanding of the content. As a result of advances in technology, new text classification methods have emerged that are both much faster and much more reliable than earlier methods. Indeed, various factors have been used to classify text to date, including “sentiment analysis, language detection, intent detection, spam detection, and topic detection”. In English linguistics, one of the most common problems is classifying texts according to their readability level. In this article, the authors review methods for using machine learning to classify English text by difficulty and readability level. The authors also discuss the drawbacks of the deep learning method in terms of accuracy and speed.

There are 16 citations in total.

Details

Primary Language English
Subjects Computer Software
Journal Section REVIEWS
Authors

Somayyeh Shabestanı

Merve Geçikli 0000-0002-8619-5026

Publication Date January 22, 2024
Submission Date March 3, 2023
Acceptance Date June 6, 2023
Published in Issue Year 2024 Volume: 7 Issue: 1

Cite

APA Shabestanı, S., & Geçikli, M. (2024). Machine learning use for English texts’ classification (A mini-review). Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 7(1), 414-423. https://doi.org/10.47495/okufbed.1259868
AMA Shabestanı S, Geçikli M. Machine learning use for English texts’ classification (A mini-review). Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. January 2024;7(1):414-423. doi:10.47495/okufbed.1259868
Chicago Shabestanı, Somayyeh, and Merve Geçikli. “Machine Learning Use for English texts’ Classification (A Mini-Review)”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 7, no. 1 (January 2024): 414-23. https://doi.org/10.47495/okufbed.1259868.
EndNote Shabestanı S, Geçikli M (January 1, 2024) Machine learning use for English texts’ classification (A mini-review). Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 7 1 414–423.
IEEE S. Shabestanı and M. Geçikli, “Machine learning use for English texts’ classification (A mini-review)”, Osmaniye Korkut Ata University Journal of The Institute of Science and Techno, vol. 7, no. 1, pp. 414–423, 2024, doi: 10.47495/okufbed.1259868.
ISNAD Shabestanı, Somayyeh - Geçikli, Merve. “Machine Learning Use for English texts’ Classification (A Mini-Review)”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 7/1 (January 2024), 414-423. https://doi.org/10.47495/okufbed.1259868.
JAMA Shabestanı S, Geçikli M. Machine learning use for English texts’ classification (A mini-review). Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. 2024;7:414–423.
MLA Shabestanı, Somayyeh and Merve Geçikli. “Machine Learning Use for English texts’ Classification (A Mini-Review)”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 7, no. 1, 2024, pp. 414-23, doi:10.47495/okufbed.1259868.
Vancouver Shabestanı S, Geçikli M. Machine learning use for English texts’ classification (A mini-review). Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. 2024;7(1):414-23.


*This journal is an international refereed journal.

*Our journal does not charge any article processing fees during the publication process.

*This journal is published online in 5 issues per year (January, March, June, September, December).

*This journal is published in Turkish and English as open access.

This work is licensed under a Creative Commons Attribution 4.0 International License.