Research Article
Year 2021, Volume: 10 Issue: 4, 119 - 137, 31.12.2021

An Exploratory Analysis of Leaked Facebook Data: A Case of Turkish Users


Abstract

This study analyzes recently leaked data of Turkish Facebook users to examine their sharing behavior and to measure how accurately an adversary can attack the data to learn a user's gender and username. Experimental results show that most users do not disclose their sensitive data, with the exception of phone numbers. Users mostly live in big cities, whereas privacy-aware users mostly live in the eastern and southeastern parts of the country. Gender can also be inferred with an accuracy of up to 0.95 using only a user's first name and username.
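To make the gender inference attack more concrete, the following is a minimal sketch of how a first name and a username could be matched against a name lexicon. It is an illustration under stated assumptions and not the method evaluated in the study: the lexicon `NAME_GENDER`, the helper `normalize`, and the function `infer_gender` are hypothetical names introduced here, and the six-entry lexicon is a toy stand-in for a full name database.

```python
from typing import Optional

# Toy lexicon for illustration only (assumption): a real attack would rely on
# a large database of person names labelled with their typical gender.
NAME_GENDER = {
    "ayse": "F", "fatma": "F", "zeynep": "F",
    "mehmet": "M", "ahmet": "M", "mustafa": "M",
}

# Fold Turkish-specific lowercase letters to their closest ASCII letters.
_FOLD = str.maketrans("çğıöşü", "cgiosu")


def normalize(text: str) -> str:
    """Lowercase the text and fold Turkish letters to ASCII."""
    # Replace the dotted capital I first, because str.lower() would turn it
    # into "i" followed by a combining dot.
    return text.replace("İ", "i").lower().translate(_FOLD)


def infer_gender(first_name: str, username: str) -> Optional[str]:
    """Guess a gender label from the first name, falling back to the username."""
    # Step 1: exact lookup of each token of the first-name field.
    for token in normalize(first_name).split():
        if token in NAME_GENDER:
            return NAME_GENDER[token]
    # Step 2: look for known names embedded in the username and
    # prefer the longest one, since short matches are more likely spurious.
    user = normalize(username)
    hits = [name for name in NAME_GENDER if name in user]
    if hits:
        return NAME_GENDER[max(hits, key=len)]
    return None  # no evidence either way


if __name__ == "__main__":
    print(infer_gender("Ayşe", "a.yilmaz"))
    print(infer_gender("", "mehmet.k_1990"))
    print(infer_gender("Xyz", "qwerty"))
```

The username fallback reflects the idea that usernames often embed the owner's name, so a record that hides the first name may still reveal it. How well such a lookup performs depends entirely on the coverage and quality of the lexicon.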


Details

Primary Language English
Subjects Computer Software
Journal Section Research Article
Authors

Önder Çoban 0000-0001-9404-2583

Publication Date December 31, 2021
Submission Date November 9, 2021
Published in Issue Year 2021 Volume: 10 Issue: 4

Cite

IEEE Ö. Çoban, “An Exploratory Analysis of Leaked Facebook Data: A Case of Turkish Users”, IJISS, vol. 10, no. 4, pp. 119–137, 2021.