Research Article

Clustering Court Orders Using Machine Learning Methods

Year 2023, Volume: 8, Issue: 2, 148-158, 20.12.2023
https://doi.org/10.53070/bbd.1318518

Abstract

Artificial intelligence (AI) is a rapidly evolving technology with applications in many areas of life, including law. In the legal domain it is used for legal research, case management, legal consultancy, legal language analysis, analysis of case precedents, and legal risk assessment. Many of these applications are built on natural language processing (NLP) techniques, and text clustering is one of them. Text clustering is a technique used in NLP and machine learning to group similar texts based on their content or linguistic features. Because legal texts are complex and voluminous, clustering methods make a valuable contribution: by grouping cases with similar attributes in a specific subject area, they help us better understand legal principles and judicial trends. They also give legal researchers quick access to a wide range of cases and improve the legal analysis process. Furthermore, the outcomes of clustering can be used in areas such as the development of legal strategies, pre-trial preparation, and substantiating legal decisions. In this study, decisions of the Dispute Resolution Court were processed with natural language processing using the TF-IDF method and then clustered with the CURE, K-MEANS, DBSCAN, AGNES, AFFINITY, and BIRCH algorithms. Based on the evaluation metrics, the BIRCH algorithm yielded the best results.
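
The pipeline described in the abstract (TF-IDF weighting followed by clustering) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The four toy texts are hypothetical, the smoothed IDF formula is one common variant chosen here as an assumption, and K-means is used only because it is the simplest of the algorithms named above. Centroids are seeded with the first k documents to keep the run deterministic, which a real study would not do.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each token occurs.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant; the paper does not specify one).
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=20):
    """Plain K-means; centroids are seeded with the first k vectors (simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy "decisions"; real court texts would need proper preprocessing.
texts = [
    "court ruled jurisdiction dispute administrative court",
    "tax penalty appeal tax office",
    "administrative court jurisdiction dispute civil court",
    "tax office penalty appeal rejected",
]
docs = [t.split() for t in texts]
vocab, vectors = tfidf(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

The two jurisdiction texts share no tokens with the two tax texts, so they end up in separate clusters. On a real corpus the choice of tokenisation, stop words, and initialisation would matter far more than it does here.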

References

  • Aizawa, A. (2003). An information-theoretic perspective of tf–idf measures. Information Processing & Management, 39(1), 45–65. https://doi.org/10.1016/S0306-4573(02)00021-3
  • Altszyler, E., Ribeiro, S., Sigman, M., & Fernández Slezak, D. (2017). The interpretation of dream meaning: Resolving ambiguity using Latent Semantic Analysis in a small corpus of text. Consciousness and Cognition, 56, 178–187. https://doi.org/10.1016/j.concog.2017.09.004
  • Anders, K.-H. (2003, April). A hierarchical graph-clustering approach to find groups of objects. 5th Workshop on Progress in Automated Map Generalization, 1–8.
  • Ay, S. (2018). Türkiye’deki Ceza Davalarının İstatistiksel Analizi. İstanbul Ticaret Üniversitesi Sosyal Bilimler Dergisi, 17(33), 25–36.
  • Aydın, Ö. (2020). Mobbing İçerikli Yargı Kararlarının Makine Öğrenmesi Algoritmaları ile Sınıflandırılması [Published master's thesis]. Balıkesir Üniversitesi Fen Bilimleri Enstitüsü.
  • Bateni, M., Behnezhad, S., Derakhshan, M., Hajiaghayi, M., Kiveris, R., Lattanzi, S., & Mirrokni, V. (2017). Affinity clustering: Hierarchical clustering at scale. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 1–11).
  • Guha, S., Rastogi, R., & Shim, K. (2001). Cure: an efficient clustering algorithm for large databases. Information Systems, 26(1), 35–58. https://doi.org/10.1016/S0306-4379(01)00008-4
  • Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques (3rd ed.). The Morgan Kaufmann Series in Data Management Systems.
  • Hosseini, S., & Varzaneh, Z. A. (2022). Deep text clustering using stacked AutoEncoder. Multimedia Tools and Applications, 81(8), 10861–10881. https://doi.org/10.1007/s11042-022-12155-0
  • Kılıç, B., & Öner, Y. (2021). Yargıtay Kararlarının Suç Türlerine Göre Makine Öğrenmesi Yöntemleri İle Sınıflandırılması (Vol. 4, Issue 3). https://dergipark.org.tr/en/download/article-file/2032425
  • Kodinariya, T. M., & Makwana, P. R. (2013). Review on determining number of Cluster in K-Means Clustering. International Journal of Advance Research in Computer Science and Management Studies, 1(6), 90–95.
  • Lovmar, L., Ahlford, A., Jonsson, M., & Syvänen, A.-C. (2005). Silhouette scores for assessment of SNP genotype clusters. BMC Genomics, 6(1), 35. https://doi.org/10.1186/1471-2164-6-35
  • Lukasik, S., Kowalski, P. A., Charytanowicz, M., & Kulczycki, P. (2016). Clustering using flower pollination algorithm and Calinski-Harabasz index. 2016 IEEE Congress on Evolutionary Computation (CEC), 2724–2728. https://doi.org/10.1109/CEC.2016.7744132
  • Mumcuoğlu, E., Öztürk, C. E., Ozaktas, H. M., & Koç, A. (2021). Natural language processing in law: Prediction of outcomes in the higher courts of Turkey. Information Processing and Management, 58(5). https://doi.org/10.1016/j.ipm.2021.102684
  • Ogbuabor, G., & F. N, U. (2018). Clustering Algorithm for a Healthcare Dataset Using Silhouette Score Value. International Journal of Computer Science and Information Technology, 10(2), 27–37. https://doi.org/10.5121/ijcsit.2018.10203
  • Petrovic, S. (2006). A Comparison Between The Silhouette Index And The Davies-Bouldin Index in Labelling Ids Clusters. 11th Nordic Workshop of Secure IT Systems, 53–64.
  • Prameswari, P., Zulkarnain, Surjandari, I., & Laoh, E. (2017). Mining online reviews in Indonesia’s priority tourist destinations using sentiment analysis and text summarization approach. 2017 IEEE 8th International Conference on Awareness Science and Technology (ICAST), 121–126. https://doi.org/10.1109/ICAwST.2017.8256429
  • Ramadhani, S., Azzahra, D., & Z, T. (2022). Comparison of K-Means and K-Medoids Algorithms in Text Mining based on Davies Bouldin Index Testing for Classification of Student’s Thesis. Digital Zone: Jurnal Teknologi Informasi Dan Komunikasi, 13(1), 24–33. https://doi.org/10.31849/digitalzone.v13i1.9292
  • Rashid, J., Shah, S. M. A., & Irtaza, A. (2020). An Efficient Topic Modeling Approach for Text Mining and Information Retrieval through K-means Clustering. Mehran University Research Journal of Engineering and Technology, 39(1), 213–222. https://doi.org/10.22581/muet1982.2001.20
  • Rehman, S. U., Asghar, S., Fong, S., & Sarasvady, S. (2014). DBSCAN: Past, present and future. The Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), 232–238. https://doi.org/10.1109/ICADIWT.2014.6814687
  • Shahapure, K. R., & Nicholas, C. (2020). Cluster Quality Analysis Using Silhouette Score. 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), 747–748. https://doi.org/10.1109/DSAA49011.2020.00096
  • Singh, A. K., Mittal, S., Malhotra, P., & Srivastava, Y. V. (2020). Clustering Evaluation by Davies-Bouldin Index (DBI) in Cereal data using K-Means. 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), 306–310. https://doi.org/10.1109/ICCMC48092.2020.ICCMC-00057
  • Suchacka, G., Cabri, A., Rovetta, S., & Masulli, F. (2021). Efficient on-the-fly Web bot detection. Knowledge-Based Systems, 223, 107074. https://doi.org/10.1016/j.knosys.2021.107074
  • Wang, X., & Xu, Y. (2019). An improved index for clustering validation based on Silhouette index and Calinski-Harabasz index. IOP Conference Series: Materials Science and Engineering, 569(5), 052024. https://doi.org/10.1088/1757-899X/569/5/052024
  • Zhang, T., Ramakrishnan, R., & Livny, M. (1997). BIRCH: A New Data Clustering Algorithm and Its Applications. Data Mining and Knowledge Discovery, 1(2), 141–182. https://doi.org/10.1023/A:1009783824328
  • Zhao, Q., & Fränti, P. (2014). WB-index: A sum-of-squares based index for cluster validity. Data & Knowledge Engineering, 92, 77–89. https://doi.org/10.1016/j.datak.2014.07.008

Makine Öğrenmesi Yöntemleri Kullanılarak Mahkeme Kararlarlarının Kümelenmesi


Abstract

Artificial intelligence is a technology that has developed rapidly in recent years and has found application in almost every area of life. It has begun to be used in health, automotive, education, music, finance, agriculture, and many other fields. One of these fields is law, where artificial intelligence has many application settings. In addition to serving as a supporting tool in legal research, case management, legal consultancy, legal language analysis, case-law searches, and legal risk analysis, it is also used for tasks such as the analysis of judicial decisions. Many artificial intelligence applications in the legal field have been developed using natural language processing technology. Text clustering is one of these application areas. Text clustering is a technique used in natural language processing and machine learning that helps group similar texts according to their content or linguistic features. Because the legal field in particular involves a complex and large body of text, clustering methods make a valuable contribution. By grouping cases with similar characteristics on a given subject, these methods help us better understand legal principles and judicial trends. Clustering methods offer advantages such as giving legal researchers quick access to a wide range of cases and improving the legal analysis process. In addition, clustering results are used in many different areas, such as developing legal strategies, pre-trial preparation, and substantiating legal decisions. In this study, decisions of the Dispute Resolution Court (Uyuşmazlık Mahkemesi) were processed with natural language processing using the TF-IDF method and then clustered with artificial intelligence methods such as CURE, K-MEANS, DBSCAN, AGNES, AFFINITY, and BIRCH. According to the evaluation metrics, the BIRCH algorithm gave the best result.
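
The abstract does not say which evaluation metrics were used to rank the algorithms. As an illustration only, the sketch below computes the silhouette coefficient, one widely used internal clustering metric, and assumes Euclidean distance. The points and both labelings are hypothetical toy data, and the function names are introduced here for the example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so divide by len(own) - 1).
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight groups far apart on the x axis.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # groups cut across it
print(round(good, 3), round(bad, 3))
```

Values close to 1 indicate compact, well-separated clusters and negative values indicate points that sit closer to another cluster than to their own, so a comparison across algorithms would favour the labeling with the higher score.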

There are 25 citations in total.

Details

Primary Language: Turkish
Subjects: Machine Learning (Other)
Journal Section: PAPERS
Authors:

Muhammed Burak Görentaş (ORCID: 0000-0001-8898-9631)

Taner Uçkan (ORCID: 0000-0001-5385-6775)

Publication Date: December 20, 2023
Submission Date: June 22, 2023
Acceptance Date: October 6, 2023
Published in Issue: Year 2023, Volume: 8, Issue: 2

Cite

APA: Görentaş, M. B., & Uçkan, T. (2023). Makine Öğrenmesi Yöntemleri Kullanılarak Mahkeme Kararlarlarının Kümelenmesi. Computer Science, 8(2), 148-158. https://doi.org/10.53070/bbd.1318518

The Creative Commons Attribution 4.0 International License is applied to all research papers published by JCS, and a Digital Object Identifier (DOI) is assigned to each published paper.