Year 2017, Volume 18, Issue 2, Pages 346 - 359 2017-06-30

STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM

Senem Kumova Metin, Bahar Karaoğlan


In a wide group of languages, stop words, which have only grammatical roles and do not contribute to information content, can be identified simply by their relatively high occurrence frequencies. In agglutinative or inflectional languages, however, a stop word may be observed in several different surface forms because of inflection, and this produces noise.
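The frequency-based idea and the noise described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy token lists, the function name `top_k_by_frequency`, and the cutoff `k` are all assumptions made for illustration.

```python
from collections import Counter

def top_k_by_frequency(tokens, k):
    """Return the k most frequent tokens as stop word candidates (hypothetical helper)."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Toy English sample: the function word "the" has a single surface form.
english = "the cat saw the dog and the bird".split()

# Toy Turkish sample: the demonstrative "bu" appears in several inflected
# surface forms (bu, bunu, buna, bunun), so its count is split four ways.
turkish = "bu kitap bunu kitap buna kitap bunun".split()

english_candidates = top_k_by_frequency(english, 1)
turkish_candidates = top_k_by_frequency(turkish, 1)
```

In the English sample the top candidate is the stop word "the". In the Turkish sample the content word "kitap" outranks every individual form of "bu", even though the forms of "bu" together occur more often.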

In this study, several well-known binary classification methods are employed to overcome the inflectional noise problem in stop word detection. The experiments are conducted on corpora of an agglutinative language, Turkish, in which the amount of inflection is high, and a non-agglutinative language, English, in which stop words show less inflection. The evaluations demonstrate that, on the Turkish corpus, the classification methods improve stop word detection relative to the frequency-based method. On the English corpora, by contrast, the classification methods yield no improvement in stop word detection performance.
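As a rough sketch of what treating stop word detection as binary classification can look like, the example below labels a few words as stop or content and classifies unseen words from simple corpus statistics. The abstract does not name the classifiers or features used, so everything here is an assumption: a nearest-centroid rule stands in for the unnamed methods, the two features (relative frequency and document ratio) are illustrative rather than the paper's feature set, and the corpus and function names are hypothetical.

```python
import math

def word_features(word, documents):
    """Two illustrative features: relative frequency and share of documents containing the word."""
    total_tokens = sum(len(doc) for doc in documents)
    term_count = sum(doc.count(word) for doc in documents)
    docs_with_word = sum(1 for doc in documents if word in doc)
    return (term_count / total_tokens, docs_with_word / len(documents))

def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(labeled_words, documents):
    """Compute one centroid per class from (word, is_stop) training pairs."""
    stop = [word_features(w, documents) for w, is_stop in labeled_words if is_stop]
    content = [word_features(w, documents) for w, is_stop in labeled_words if not is_stop]
    return centroid(stop), centroid(content)

def is_stop_word(word, documents, model):
    """Assign the word to the class whose centroid is closer (Euclidean distance)."""
    stop_centroid, content_centroid = model
    x = word_features(word, documents)
    return math.dist(x, stop_centroid) < math.dist(x, content_centroid)

# Hypothetical toy corpus, one token list per document.
documents = [
    "the cat sat on the mat and the dog sat on the rug".split(),
    "the bird and the cat sat on the tree".split(),
    "the dog and the bird ran on the road".split(),
]

# Hypothetical training labels: True marks a stop word.
labeled = [("the", True), ("on", True), ("cat", False), ("mat", False)]
model = train(labeled, documents)
```

With this toy data, `is_stop_word("and", documents, model)` evaluates to `True` and `is_stop_word("tree", documents, model)` to `False`. The features are left unscaled for brevity; a real experiment would need feature scaling, far more training data, and a held-out evaluation set.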

Keywords: stop word, content word, binary classification, tf-idf
Subjects: Engineering
Journal Section: Articles

Bibtex @article{aubtda322136, journal = {ANADOLU UNIVERSITY JOURNAL OF SCIENCE AND TECHNOLOGY A - Applied Sciences and Engineering}, issn = {1302-3160}, eissn = {2146-0205}, address = {Eskişehir Teknik Üniversitesi}, year = {2017}, volume = {18}, pages = {346 - 359}, doi = {10.18038/aubtda.322136}, title = {STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM}, key = {cite}, author = {Kumova Metin, Senem and Karaoğlan, Bahar} }
APA Kumova Metin, S., Karaoğlan, B. (2017). STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM. ANADOLU UNIVERSITY JOURNAL OF SCIENCE AND TECHNOLOGY A - Applied Sciences and Engineering, 18(2), 346-359. DOI: 10.18038/aubtda.322136