Named Entity Recognition in Turkish Bank Documents
Abstract
Named Entity Recognition (NER) is the process of automatically recognizing entities such as persons, organizations, and dates in a document. In this study, we focus on bank documents written in Turkish and propose a Conditional Random Fields (CRF) model to extract named entities. The contribution of this study is twofold: (i) we propose domain-specific features to extract entities such as law, regulation, and reference, which frequently appear in bank documents; and (ii) we contribute to NER research on Turkish documents, which is less mature than NER research on languages such as English and German. Experimental results based on 10-fold cross-validation on 551 real-life, anonymized bank documents show that the proposed CRF-NER model achieves a micro-averaged F1 score of 0.962. More specifically, the F1 score is 0.979 for law names, 0.850 for regulation names, and 0.850 for article numbers.
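To make the reported metric concrete, the following is a minimal sketch of how a micro-averaged F1 score differs from per-class F1 scores. The entity labels and all counts are hypothetical and chosen only for illustration; they are not the data or results of this study, and the function names `f1` and `micro_f1` are assumptions rather than part of the authors' implementation.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one entity class
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(per_class: Dict[str, Counts]) -> float:
    """Micro-averaging pools the counts of all classes before computing F1,
    so frequent classes weigh more than rare ones."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    return f1(tp, fp, fn)


# Hypothetical counts, NOT taken from the paper.
counts: Dict[str, Counts] = {
    "LAW": (90, 5, 5),
    "REGULATION": (17, 3, 3),
    "ARTICLE_NO": (40, 10, 10),
}

for label, (tp, fp, fn) in counts.items():
    print(f"{label}: F1 = {f1(tp, fp, fn):.3f}")
print(f"micro-averaged F1 = {micro_f1(counts):.3f}")
```

Because the counts are pooled, a class with many instances can pull the micro-averaged score above the F1 of smaller classes, which is consistent with an overall score that is higher than some of the per-class scores.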
Supporting Institution
TÜBİTAK
Project Number
5190074
Details
Primary Language
English
Subjects
Computer Software
Journal Section
Research Article
Publication Date
November 30, 2021
Submission Date
January 31, 2021
Acceptance Date
April 13, 2021
Published in Issue
Year 2021 Volume: 4 Number: 2
APA
Kabasakal, O., & Mutlu, A. (2021). Named Entity Recognition in Turkish Bank Documents. Kocaeli Journal of Science and Engineering, 4(2), 86-92. https://doi.org/10.34088/kojose.871873