TY - JOUR
T1 - Big Data, Data Mining, Machine Learning, and Deep Learning Concepts in Crime Data
AU - Ateş, Emre Cihan
AU - Bostancı, Gazi Erkan
AU - Msg, Serdar
PY - 2020
DA - November
DO - 10.26650/JPLC2020-813328
JF - Ceza Hukuku ve Kriminoloji Dergisi
PB - İstanbul Üniversitesi
WT - DergiPark
SN - 2148-6646
SP - 293
EP - 319
VL - 8
IS - 2
LA - en
AB - With the rapid change of information technologies and the widespread use of the internet, highly diverse and very large data stores have emerged. The use of data mining is growing steadily because of the major role it plays in extracting information through analysis, especially from large volumes of data. Obtaining accurate information is a key factor affecting decision-making processes. Crime data, one of the most rapidly growing data stores, is among the application areas of data mining. Crime events constitute unwanted behaviour in every society. For this reason, it is important to extract meaningful information from crime data. This article aims to provide an overview of the use of data mining and machine learning on crime data and to offer a new perspective on decision-making processes by presenting examples of data mining applied to crime. To this end, it sets out a conceptual framework for big data, data mining, machine learning, and deep learning, together with their task types, processes, and methods, and presents examples of data mining and machine learning in the crime and security domains.
KW - Crime
KW - Big data
KW - Data mining
KW - Security
KW - Policing
UR - https://doi.org/10.26650/JPLC2020-813328
L1 - https://dergipark.org.tr/tr/download/article-file/1354184
ER -