Automatic Precedent Legal Document Detection (Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi)

Meltem Cetiner; Yusuf Sinan Akgül

doi:10.29130/dubited.1012386

Research Article

Year 2021, Volume: 9, Issue: 6 - ICAIAME 2021, pp. 83-94, 31.12.2021

Abstract

Natural language processing, like artificial intelligence in general, is a field that gains momentum as the amount of available data grows. The number of legal documents, the subject of this study, is also increasing over time. Identifying another case that can be cited as a precedent in a case is highly important because it can completely change the course of that case. In this study on precedent case detection, a data set was created from the National Judiciary Informatics System (UYAP) data bank. Using case templates obtained by examining the cases, three different systems were built with an LSTM model that generates text from text, with different parts of the documents serving as input and output. Representation vectors for the text outputs of these systems were obtained from different BERT models, and the FAISS library was used to quickly retrieve the nearest documents for the test data. For test legal documents in 5 different case-type categories, two lawyers independently marked the 10 most similar documents among the documents in the category clusters. The results obtained from the systems were compared with those marked by the lawyers, and the similarities are explained with examples.
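The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the short toy vectors stand in for BERT representation vectors, a brute-force cosine search stands in for the FAISS index, and the names `cosine`, `top_k_similar`, and the sample corpus are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(query_vec, corpus, category, k=10):
    """Return ids of the k documents in `category` most similar to query_vec.

    `corpus` maps a document id to a (category, vector) pair. Only documents
    in the same category cluster as the query are considered.
    """
    scored = [
        (cosine(query_vec, vec), doc_id)
        for doc_id, (cat, vec) in corpus.items()
        if cat == category
    ]
    # Highest similarity first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Toy corpus: 3-dimensional vectors as placeholders for real embeddings.
corpus = {
    "doc_a": ("labor", [1.0, 0.0, 0.0]),
    "doc_b": ("labor", [0.9, 0.1, 0.0]),
    "doc_c": ("labor", [0.0, 1.0, 0.0]),
    "doc_d": ("family", [1.0, 0.0, 0.0]),
}
query = [1.0, 0.05, 0.0]
nearest = top_k_similar(query, corpus, "labor", k=2)
print(nearest)
```

A brute-force scan like this is only practical for small collections; an approximate or GPU-backed index such as the one FAISS provides serves the same purpose at scale.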

Keywords

Precedent cases, Sequence-to-sequence neural networks, Text similarity

Supporting Institution

IDEA Teknoloji Çözümleri

Acknowledgments

We thank Enes Almahdi for helping us prepare the UYAP data.

Automatic Precedent Legal Document Detection

Abstract

Natural language processing is one of the fields of artificial intelligence that gain momentum as the amount of data increases. The number of legal documents discussed in this study is also increasing over time. Identifying another case that can be cited as a precedent, called a precedent document, is very important because it can completely change the direction of a case. In this study, which deals with the detection of precedent cases, a data set was created from the National Judiciary Informatics System (UYAP) data bank. Using case templates obtained by studying the documents, three different systems were created with a sequence-to-sequence LSTM model, which generates text from text, with different parts of the documents serving as input and output. After the systems generate their outputs, text representation vectors are created using different BERT models. These vectors are used to find the most similar documents via the FAISS library. The test legal documents were selected from 5 different case category clusters. Two lawyers helped identify the 10 most similar documents within the category clusters of the 5 selected test legal documents. The 10 most similar documents were also generated by each system. The results of all systems are compared with the lawyers' annotations and explained with examples.
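The abstract does not state which measure was used to compare system outputs with the lawyers' annotations. As one simple possibility, the sketch below computes the fraction of a system's top-k documents that an annotator also marked. The function name `overlap_at_k` and all document ids are hypothetical.

```python
def overlap_at_k(system_ranking, annotated, k=10):
    """Fraction of the system's top-k documents that the annotator also marked.

    `system_ranking` is a list of document ids ordered from most to least
    similar; `annotated` is the set of ids one lawyer marked as similar.
    """
    top = system_ranking[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if doc_id in annotated)
    return hits / len(top)


# Toy example: one system ranking scored against two annotators.
system_ranking = ["d1", "d2", "d3", "d4"]
lawyer_a = {"d1", "d3", "d9"}
lawyer_b = {"d2", "d3", "d4"}

score_a = overlap_at_k(system_ranking, lawyer_a, k=4)
score_b = overlap_at_k(system_ranking, lawyer_b, k=4)
print(score_a, score_b)
```

Scoring against each lawyer separately keeps disagreements between annotators visible instead of hiding them in a merged label set.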

Keywords

Precedent documents, Sequence-to-sequence neural networks, Text similarity

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Meltem Cetiner 0000-0001-5026-0642 Yusuf Sinan Akgül 0000-0001-8501-4812
Publication Date	December 31, 2021
Published in Issue	Year 2021 Volume: 9 Issue: 6 - ICAIAME 2021

Cite

APA	Cetiner, M., & Akgül, Y. S. (2021). Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi. Duzce University Journal of Science and Technology, 9(6), 83-94. https://doi.org/10.29130/dubited.1012386
AMA	Cetiner M, Akgül YS. Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi. DÜBİTED. December 2021;9(6):83-94. doi:10.29130/dubited.1012386
Chicago	Cetiner, Meltem, and Yusuf Sinan Akgül. “Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi”. Duzce University Journal of Science and Technology 9, no. 6 (December 2021): 83-94. https://doi.org/10.29130/dubited.1012386.
EndNote	Cetiner M, Akgül YS (December 1, 2021) Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi. Duzce University Journal of Science and Technology 9 6 83–94.
IEEE	M. Cetiner and Y. S. Akgül, “Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi”, DÜBİTED, vol. 9, no. 6, pp. 83–94, 2021, doi: 10.29130/dubited.1012386.
ISNAD	Cetiner, Meltem - Akgül, Yusuf Sinan. “Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi”. Duzce University Journal of Science and Technology 9/6 (December 2021), 83-94. https://doi.org/10.29130/dubited.1012386.
JAMA	Cetiner M, Akgül YS. Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi. DÜBİTED. 2021;9:83–94.
MLA	Cetiner, Meltem and Yusuf Sinan Akgül. “Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi”. Duzce University Journal of Science and Technology, vol. 9, no. 6, 2021, pp. 83-94, doi:10.29130/dubited.1012386.
Vancouver	Cetiner M, Akgül YS. Emsal Hukuk Dokümanlarının Otomatik Belirlenmesi. DÜBİTED. 2021;9(6):83-94.
