Research Article
Year 2021, Volume: 34 Issue: 3, 718 - 731, 01.09.2021
https://doi.org/10.35378/gujs.715296

Efficient Turkish Text Classification Approach for Crisis Management Systems

Abstract

In this paper, an effective tweet classification system that fully supports the Turkish language is developed. The proposed system can be used to mine (classify) recently published, publicly available tweets and find those most relevant and useful to a crisis. This supports situational awareness, which helps in taking the correct responses to prevent, or at least reduce, the effects of such situations. An in-depth study was carried out to improve and optimize the proposed system. In particular, intensive experiments were performed to investigate the performance of several well-known machine learning algorithms, i.e., K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Naive Bayes (NB), when used for text (tweet) classification. The performance of ensembles of the studied algorithms, and of the Random Forest (RF), AdaBoost Classifier (AdaBoost), and GradientBoosting Classifier (GBC) ensemble systems, was also examined. As shown in the experimental evaluation and analysis, the proposed approach is stable and robust and achieves good performance when processing the Turkish language. The performance of the proposed classifier was also compared with two state-of-the-art text classification approaches, i.e., “Empirical” and “Turkish Deep”.
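
As a rough illustration of what an ensemble of several base classifiers can look like, the sketch below combines label predictions by hard (majority) voting. The abstract does not state how the studied ensembles combine their members, so the voting rule, the `Classifier` interface, and the keyword-based stand-ins for KNN, SVM, and NB are all assumptions made for this example, not the paper's method.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical interface: a classifier is any callable mapping a tweet to a label.
Classifier = Callable[[str], str]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label that appears first."""
    if not labels:
        raise ValueError("no labels to vote on")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label always reaches the top count")


def ensemble_predict(classifiers: Sequence[Classifier], tweet: str) -> str:
    """Hard-voting ensemble: every base classifier casts one vote per tweet."""
    return majority_vote([clf(tweet) for clf in classifiers])


def keyword_rule(keyword: str) -> Classifier:
    """Toy stand-in for a trained model (assumption): flags tweets containing a keyword."""
    return lambda tweet: "crisis" if keyword in tweet.lower() else "other"


# Three toy base classifiers standing in for trained KNN, SVM, and NB models.
base_classifiers = [keyword_rule("deprem"), keyword_rule("yardım"), keyword_rule("enkaz")]

if __name__ == "__main__":
    for tweet in ["Deprem oldu, yardım gerekiyor", "Bugün hava güzel"]:
        print(tweet, "->", ensemble_predict(base_classifiers, tweet))
```

In practice the keyword rules would be replaced by trained models over real tweet features; the voting step itself stays the same.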

References

  • [1] Domala, J., Dogra, M., Masrani, V., Fernandes, D., D'souza, K., Fernandes, D., & Carvalho, T., “Automated Identification of Disaster News for Crisis Management using Machine Learning and Natural Language Processing”, In 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), 503-508, (2020).
  • [2] Alshehri, A., & Alahamri, S., “An Ensemble Learning for Detecting Situational Awareness Tweets during Environmental Hazards”, In 2019 IEEE International Systems Conference (SysCon), 1-8, (2019).
  • [3] Kumar, A., Singh, J. P., & Saumya, S., “A Comparative Analysis of Machine Learning Techniques for Disaster-Related Tweet Classification”, In 2019 IEEE R10 Humanitarian Technology Conference (R10-HTC), 222-227, (2019).
  • [4] Nalluru, G., Pandey, R., & Purohit, H., “Relevancy classification of multimodal social media streams for emergency services”, In 2019 IEEE International Conference on Smart Computing (SMARTCOMP), 121-125, (2019).
  • [5] Ayata, D., Saraçlar, M., & Özgür, A., “Turkish tweet sentiment analysis with word embedding and machine learning”, In 2017 25th Signal Processing and Communications Applications Conference (SIU), 1-4, (2017).
  • [6] Naili, M., Chaibi, A. H., & Ghezala, H. H. B., “Comparative study of word embedding methods in topic segmentation”, Procedia Computer Science, 112, 340-349, (2017).
  • [7] Mikolov, T., Chen, K., Corrado, G., & Dean, J., “Efficient estimation of word representations in vector space”, arXiv preprint arXiv:1301.3781, (2013).
  • [8] Şahin, G., “Turkish document classification based on Word2Vec and SVM classifier”, In 2017 25th Signal Processing and Communications Applications Conference (SIU), 1-4, (2017).
  • [9] Aydoğan, M., & Karci, A., “Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification”, Physica A: Statistical Mechanics and its Applications, 541, 123288, (2020).
  • [10] Kılınç, D., Özçift, A., Bozyigit, F., Yıldırım, P., Yücalar, F., & Borandag, E., “TTC-3600: A new benchmark dataset for Turkish text categorization”, Journal of Information Science, 43(2), 174-185, (2017).
  • [11] Kılınç, D., “The Effect of Ensemble Learning Models on Turkish Text Classification”, Celal Bayar Üniversitesi Fen Bilimleri Dergisi, 12(2), (2016).
  • [12] Demirci, G. M., Keskin, Ş. R., & Doğan, G., “Sentiment Analysis in Turkish with Deep Learning”, In 2019 IEEE International Conference on Big Data (Big Data), 2215-2221, (2019).
  • [13] Baygın, M., “Classification of text documents based on Naive Bayes using N-Gram features”, In 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), 1-5, (2018).
  • [14] Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., & Mikolov, T., “FastText.zip: Compressing text classification models”, arXiv preprint arXiv:1612.03651, (2016).
  • [15] Pennington, J., Socher, R., & Manning, C. D., “Glove: Global vectors for word representation”, In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532-1543, (2014).
  • [16] Cunningham, P., & Delany, S. J., “k-Nearest neighbour classifiers”, Multiple Classifier Systems, 34(8), 1-17, (2007).
  • [17] Nikhath, A. K., Subrahmanyam, K., & Vasavi, R., “Building a K-Nearest Neighbor Classifier for Text Categorization”, International Journal of Computer Science and Information Technologies, 7(1), 254-256, (2016).
  • [18] Frank, E., & Bouckaert, R. R., “Naive bayes for text classification with unbalanced classes”, In European Conference on Principles of Data Mining and Knowledge Discovery, Springer, Berlin, Heidelberg, 503-510, (2006).
  • [19] Dadgar, S. M. H., Araghi, M. S., & Farahani, M. M., “A novel text mining approach based on TF-IDF and Support Vector Machine for news classification”, In 2016 IEEE International Conference on Engineering and Technology (ICETECH), 112-116, (2016).
  • [20] Dietterich, T. G., “Ensemble methods in machine learning”, In International Workshop on Multiple Classifier Systems, Springer, Berlin, Heidelberg, 1-15, (2000).
  • [21] Onan, A., Korukoğlu, S., & Bulut, H., “Ensemble of keyword extraction methods and classifiers in text classification”, Expert Systems with Applications, 57, 232-247, (2016).
  • [22] Elith, J., “Machine Learning, Random Forests, and Boosted Regression Trees”, Quantitative Analyses in Wildlife Science, 281, (2019).
  • [23] Rodriguez, J. J., Kuncheva, L. I., & Alonso, C. J., “Rotation forest: A new classifier ensemble method”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1619-1630, (2006).
  • [24] Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., & Liu, T. Y., “Lightgbm: A highly efficient gradient boosting decision tree”, In Advances in Neural Information Processing Systems, 3146-3154, (2017).
  • [25] Ragini, J. R., & Anand, P. R., “An empirical analysis and classification of crisis related tweets”, In 2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), 1-4, (2016).
There are 25 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Computer Engineering
Authors

Saed Alqaraleh 0000-0002-7146-3905

Publication Date September 1, 2021
Published in Issue Year 2021 Volume: 34 Issue: 3

Cite

APA Alqaraleh, S. (2021). Efficient Turkish Text Classification Approach for Crisis Management Systems. Gazi University Journal of Science, 34(3), 718-731. https://doi.org/10.35378/gujs.715296
AMA Alqaraleh S. Efficient Turkish Text Classification Approach for Crisis Management Systems. Gazi University Journal of Science. September 2021;34(3):718-731. doi:10.35378/gujs.715296
Chicago Alqaraleh, Saed. “Efficient Turkish Text Classification Approach for Crisis Management Systems”. Gazi University Journal of Science 34, no. 3 (September 2021): 718-31. https://doi.org/10.35378/gujs.715296.
EndNote Alqaraleh S (September 1, 2021) Efficient Turkish Text Classification Approach for Crisis Management Systems. Gazi University Journal of Science 34 3 718–731.
IEEE S. Alqaraleh, “Efficient Turkish Text Classification Approach for Crisis Management Systems”, Gazi University Journal of Science, vol. 34, no. 3, pp. 718–731, 2021, doi: 10.35378/gujs.715296.
ISNAD Alqaraleh, Saed. “Efficient Turkish Text Classification Approach for Crisis Management Systems”. Gazi University Journal of Science 34/3 (September 2021), 718-731. https://doi.org/10.35378/gujs.715296.
JAMA Alqaraleh S. Efficient Turkish Text Classification Approach for Crisis Management Systems. Gazi University Journal of Science. 2021;34:718–731.
MLA Alqaraleh, Saed. “Efficient Turkish Text Classification Approach for Crisis Management Systems”. Gazi University Journal of Science, vol. 34, no. 3, 2021, pp. 718-31, doi:10.35378/gujs.715296.
Vancouver Alqaraleh S. Efficient Turkish Text Classification Approach for Crisis Management Systems. Gazi University Journal of Science. 2021;34(3):718-31.