TY - JOUR
T1 - Siyaset biliminde otomatik metin analizi yöntemleri ve uygulama alanları
TT - Automated text analysis methods and application areas in political science
AU - Aydoğan Ünal, Betül
PY - 2023
DA - June
DO - 10.17218/hititsbd.1260739
JF - Hitit Sosyal Bilimler Dergisi
PB - Hitit Üniversitesi
WT - DergiPark
SN - 2757-7449
SP - 190
EP - 208
VL - 16
IS - 1
LA - tr
AB - Automated text analysis has become a rapidly growing field in political science thanks to its ability to analyze large volumes of text data in ways that were not previously possible. However, the existence of many different methods for analyzing textual data makes it harder for researchers to identify the most suitable approach for their research questions and data. This article aims to offer researchers a comprehensive resource by covering, among the automated text analysis methods used to study political phenomena, simple statistical analyses, supervised and unsupervised machine learning, distributional semantic models, and word embedding methods. Alongside basic methods such as computing simple frequency distributions and using similarity/distance measures, the basic assumptions, outputs, strengths, and weaknesses of more advanced methods are examined comparatively. The study highlights the potential of these methods to contribute to political science and presents examples from application areas.
KW - Automated Text Analysis
KW - Political Science
KW - Big Data
KW - Machine Learning
KW - Research Methods
N2 - Automated text analysis has become a rapidly growing field in political science due to its ability to analyze large-scale textual data in ways that were not previously possible. However, because there are many different methods available for analyzing textual data, it can be difficult for researchers to choose the most appropriate approach for their research questions and data.
This article provides a general overview of the use of statistical summaries, supervised and unsupervised machine learning, distributional semantic models, and word embedding methods for examining political phenomena. It compares the data requirements, outputs produced, basic assumptions, advantages, and disadvantages of not only basic methods such as calculating simple frequency distributions and similarity/distance measurements but also more advanced methods. While emphasizing the potential contribution of these methods to political science, this study provides examples from application areas.
UR - https://doi.org/10.17218/hititsbd.1260739
L1 - https://dergipark.org.tr/tr/download/article-file/2992175
ER -