Stemming is a complicated and important problem for agglutinative languages such as Turkish, where a theoretically infinite number of surface forms can be derived from a single lexeme. Both analytical and statistical approaches have been tried for stemming Turkish words. Two main problems with these approaches are the reliance on a dictionary, which enforces a closed-vocabulary assumption, and the disambiguation of the actual stem among numerous candidates. Here, we present a method that exploits the simple fact that nouns and verbs take different suffix patterns. Statistical methods are used to strip off the suffixes. The part of speech (PoS) is then determined from the suffix pattern, which in turn enables the decision on the stem boundary. Thus, the major contribution of the study is that it avoids the disambiguation problem and does not use a regular dictionary for stemming. The proposed method achieves a performance rate of 93.83% on a gold-standard PoS-tagged Turkish corpus.
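The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper strips suffixes statistically, whereas this toy version swaps in two small hand-written suffix lists. The names `NOUN_SUFFIXES`, `VERB_SUFFIXES`, `MIN_STEM_LEN`, and `stem_word`, as well as the tie-breaking rule, are assumptions made purely for illustration.

```python
# Toy suffix inventories (assumed for illustration, far from complete).
NOUN_SUFFIXES = ("ler", "lar", "den", "dan", "de", "da", "in", "ın")
VERB_SUFFIXES = ("iyor", "ıyor", "uyor", "üyor", "ecek", "acak", "um", "im", "di", "dı")

# Hypothetical lower bound on stem length, to avoid stripping a word to nothing.
MIN_STEM_LEN = 2


def strip_suffixes(word, suffixes):
    """Greedily strip suffixes from the end of the word, longest match first.

    Returns the remaining candidate stem and the list of stripped suffixes
    (outermost suffix first).
    """
    ordered = sorted(suffixes, key=len, reverse=True)
    stripped = []
    while True:
        for suffix in ordered:
            if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
                word = word[: -len(suffix)]
                stripped.append(suffix)
                break
        else:
            # No suffix from this inventory matched: stop stripping.
            return word, stripped


def stem_word(word):
    """Guess the PoS from the suffix pattern, then pick the matching stem."""
    noun_stem, noun_stripped = strip_suffixes(word, NOUN_SUFFIXES)
    verb_stem, verb_stripped = strip_suffixes(word, VERB_SUFFIXES)

    # The inventory that explains more characters of the word decides the PoS.
    noun_cover = sum(len(s) for s in noun_stripped)
    verb_cover = sum(len(s) for s in verb_stripped)

    # Ties (including "no suffix found") default to noun; an arbitrary choice.
    if verb_cover > noun_cover:
        return "verb", verb_stem
    return "noun", noun_stem


if __name__ == "__main__":
    for w in ("evlerde", "geliyorum", "kitap"):
        print(w, stem_word(w))
```

In this sketch the PoS decision comes first and the stem boundary follows from it, so no dictionary lookup is needed and only one stem candidate is returned per word. A realistic system would need much larger, data-derived suffix patterns and a way to handle suffixes shared by nouns and verbs.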
Section | Research Article |
---|---|
Authors | Tarık Kışla, Bahar Karaoğlan |
Dates | Publication Date: July 14, 2016 |
BibTeX | @article{aubtda257971,
journal = {Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering},
issn = {1302-3160},
eissn = {2146-0205},
address = {},
publisher = {Eskişehir Teknik Üniversitesi},
year = {2016},
volume = {17},
pages = {401 - 412},
doi = {10.18038/btda.31812},
title = {A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language},
key = {cite},
author = {Kışla, Tarık and Karaoğlan, Bahar}
} |
APA | Kışla, T., Karaoğlan, B. (2016). A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language. Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering, 17(2), 401-412. DOI: 10.18038/btda.31812 |
MLA | Kışla, T., Karaoğlan, B. "A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 17 (2016): 401-412 <https://dergipark.org.tr/tr/pub/aubtda/issue/24338/257971> |
Chicago | Kışla, T., Karaoğlan, B. "A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 17 (2016): 401-412 |
ISNAD | Kışla, Tarık, Karaoğlan, Bahar. "A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 17 / 2 (July 2016): 401-412. https://doi.org/10.18038/btda.31812 |
AMA | Kışla T, Karaoğlan B. A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language. AUBTD-A. 2016; 17(2): 401-412. |
Vancouver | Kışla T, Karaoğlan B. A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language. Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering. 2016; 17(2): 401-412. |
IEEE | T. Kışla and B. Karaoğlan, "A hybrid Statistical Approach to Stemming in Turkish: An Agglutinative Language", Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering, vol. 17, no. 2, pp. 401-412, Jul. 2016, doi:10.18038/btda.31812 |