Research Article

Kelime Vektörü Yöntemlerinin Model Oluşturma Sürelerinin Karşılaştırılması

Year 2019, Volume 12, Issue 2, 141 - 146, 30.04.2019
https://doi.org/10.17671/gazibtd.472226

Abstract

In this study, two different datasets created for sentiment analysis were modeled with Word2Vec, a word vector algorithm. The models were built with the two Word2Vec methods, CBoW (Continuous Bag of Words) and Skip-Gram. The Average method is generally used to build a model of a text with Word2Vec. In this study, three different methods are proposed for modeling a text with both CBoW and Skip-Gram. Model building (training) times were measured for both. The results show experimentally that CBoW is more successful than Skip-Gram in terms of modeling time.
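The Average method mentioned in the abstract can be illustrated with a minimal sketch: a text is represented by the arithmetic mean of the vectors of its words. The toy vectors, the function name `mean_text_vector`, and the choice to skip unknown words are assumptions made for illustration; the abstract does not describe the author's implementation or the three proposed alternatives.

```python
from typing import Dict, List, Optional

# Hypothetical 3-dimensional word vectors. A trained Word2Vec model
# (CBoW or Skip-Gram) would supply the real ones.
TOY_VECTORS: Dict[str, List[float]] = {
    "good": [1.0, 0.0, 2.0],
    "movie": [0.0, 2.0, 1.0],
    "bad": [-1.0, 0.0, -2.0],
}


def mean_text_vector(
    tokens: List[str], vectors: Dict[str, List[float]]
) -> Optional[List[float]]:
    """Average the vectors of the known tokens; return None if none is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    # Component-wise arithmetic mean over all known word vectors.
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


print(mean_text_vector(["good", "movie", "unseen"], TOY_VECTORS))
# prints [0.5, 1.0, 1.5]
```

Skipping out-of-vocabulary words is one common convention; mapping them to a zero vector is another, and it would change the resulting average.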

Comparison of Modeling Time of Word Vector Methods

Abstract

In this study, two different sentiment analysis datasets were modeled with Word2Vec, a word vector algorithm. The models were created with the two Word2Vec methods, CBoW and Skip-Gram. Generally, the arithmetic mean is used to model a text with Word2Vec. In this study, three different methods for modeling a text are suggested for both CBoW and Skip-Gram, and the modeling time (training time) of each is measured. As a result, it was shown experimentally that CBoW is more successful than Skip-Gram in terms of modeling time.
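As background for the timing comparison, the sketch below counts the training examples that the two methods produce from the same sentence under the standard Word2Vec formulation: CBoW forms one example per center word, while Skip-Gram forms one example per (center word, context word) pair. The function names, the sample sentence, and the window size are illustrative assumptions. Whether this difference explains the times measured in the study is also an assumption, since the abstract does not state the cause.

```python
from typing import List, Tuple


def cbow_examples(tokens: List[str], window: int) -> List[Tuple[List[str], str]]:
    """One (context words -> center word) example per token position."""
    examples = []
    for i, center in enumerate(tokens):
        # Up to `window` words on each side of the center word.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, center))
    return examples


def skipgram_examples(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """One (center word -> context word) example per context word."""
    return [
        (center, ctx)
        for context, center in cbow_examples(tokens, window)
        for ctx in context
    ]


tokens = "the film was really good".split()
print(len(cbow_examples(tokens, 2)), len(skipgram_examples(tokens, 2)))
# prints 5 14
```

With a window of 2, the five-word sentence yields 5 CBoW examples and 14 Skip-Gram examples, so Skip-Gram performs more prediction steps per sentence in this simplified view.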

Details

Primary Language Turkish
Subjects Computer Software
Journal Section Articles
Authors

Metin Bilgin 0000-0002-4216-0542

Publication Date April 30, 2019
Submission Date October 19, 2018
Published in Issue Year 2019, Volume 12, Issue 2

Cite

APA Bilgin, M. (2019). Kelime Vektörü Yöntemlerinin Model Oluşturma Sürelerinin Karşılaştırılması. Bilişim Teknolojileri Dergisi, 12(2), 141-146. https://doi.org/10.17671/gazibtd.472226