Legal text classification in Turkey: A machine learning approach to divorce and zoning decisions

Tülay Turan; Ecir Uğur Küçüksille

doi:10.70669/ijedt.1491511

Research Article

Year 2024, Volume: 6, Issue: 2, pp. 53-63, 20.12.2024

Abstract

The growing volume of legal data in recent years calls for artificial intelligence (AI) techniques to manage and use it efficiently. One critical challenge is classifying legal texts into specific fields or topics, which is crucial to advancing legal research and practice. This article aims to classify Turkish court decisions by category, an area that has received little attention compared with classification studies on international legal texts. By dividing legal texts into specific areas, the study aims to contribute to AI-supported solutions that guide Turkish legal decisions, and thereby to increase the efficiency and accessibility of the legal system. The study first created a data set consisting of divorce and zoning cases. Base models were then built with the K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF) algorithms to determine which algorithm classifies the cases most effectively. To improve on the base models, hyperparameter optimization was performed for each model, supported by 10-fold cross-validation. Improved models were then built with the hyperparameter values obtained from the optimization. In the comparative analysis, the SVM model achieved an accuracy of 90% in classifying the legal texts. This result is expected to contribute to the development of intelligent legal systems for Turkey.
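
The tuning step described in the abstract (score each candidate hyperparameter value with 10-fold cross-validation, then keep the best one) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the toy corpus, the Jaccard distance over token sets, the candidate values of `k`, and every function name below are hypothetical. Only KNN is shown; the abstract does not state the feature extraction or the hyperparameter grids used for any of the four models.

```python
from collections import Counter

# Hypothetical toy corpus: placeholder English tokens, not the authors' data.
DIVORCE = ["marriage", "spouse", "custody", "alimony", "divorce"]
ZONING = ["parcel", "zoning", "permit", "municipality", "construction"]
SHARED = ["court", "decision", "appeal", "plaintiff"]


def make_corpus(n_per_class=10):
    """Build deterministic toy documents as (token_set, label) pairs."""
    docs = []
    for label, vocab in (("divorce", DIVORCE), ("zoning", ZONING)):
        for i in range(n_per_class):
            tokens = {vocab[i % 5], vocab[(i + 1) % 5], SHARED[i % 4]}
            docs.append((tokens, label))
    return docs


def jaccard_distance(a, b):
    """Return 1 - |intersection| / |union| for two non-empty token sets."""
    return 1.0 - len(a & b) / len(a | b)


def knn_predict(train, tokens, k):
    """Majority vote among the k training documents nearest to `tokens`."""
    nearest = sorted(train, key=lambda d: jaccard_distance(d[0], tokens))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(docs, k, n_folds=10):
    """Mean accuracy over n_folds folds; fold f holds out docs[f::n_folds]."""
    scores = []
    for f in range(n_folds):
        test = docs[f::n_folds]
        train = [d for i, d in enumerate(docs) if i % n_folds != f]
        hits = sum(knn_predict(train, t, k) == y for t, y in test)
        scores.append(hits / len(test))
    return sum(scores) / n_folds


def grid_search(docs, k_values, n_folds=10):
    """Return (best_k, best_score, all_scores) over the candidate k values."""
    results = {k: cross_val_accuracy(docs, k, n_folds) for k in k_values}
    best_k = max(results, key=results.get)
    return best_k, results[best_k], results


docs = make_corpus()
best_k, best_score, results = grid_search(docs, k_values=[1, 3, 5])
print(best_k, best_score, results)
```

With 10 documents per class listed class by class, each held-out fold contains one divorce and one zoning document, so the folds stay balanced. In practice a library routine for grid search with cross-validation would typically replace these hand-written loops, and the same loop would be repeated for the SVM, DT, and RF models before comparing their accuracies.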

Keywords

Legal Text Classification, Turkish Court Decisions, Machine Learning Algorithms, Hyperparameter Optimization, SVM


Turkish title: Türkiye'de hukuki metin sınıflandırması: Boşanma ve imar kararlarına makine öğrenmesi yaklaşımı


Details

Primary Language	English
Subjects	Electrical Engineering (Other)
Section	Research Article
Authors	Tülay Turan (ORCID 0000-0002-0888-0343); Ecir Uğur Küçüksille (ORCID 0000-0002-3293-9878)
Early View Date	August 9, 2024
Publication Date	December 20, 2024
Submission Date	May 28, 2024
Acceptance Date	August 1, 2024
Published in Issue	Year 2024, Volume: 6, Issue: 2

Cite

APA	Turan, T., & Küçüksille, E. U. (2024). Legal text classification in Turkey: A machine learning approach to divorce and zoning decisions. Uluslararası Mühendislik Tasarım ve Teknoloji Dergisi, 6(2), 53-63. https://doi.org/10.70669/ijedt.1491511
