Year 2017, Volume 18, Issue 2, Pages 346-359, 2017-06-30

STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM

Senem KUMOVA METİN, Bahar KARAOĞLAN


In many languages, stop words, which play only grammatical roles and do not contribute to information content, can be identified simply by their relatively high occurrence frequencies. In agglutinative or inflectional languages, however, a stop word may appear in several different surface forms because of inflection, which introduces noise.

In this study, several well-known binary classification methods are employed to overcome the inflectional noise problem in stop word detection. The experiments are conducted on corpora of Turkish, an agglutinative language in which the amount of inflection is high, and English, a non-agglutinative language in which stop words are inflected less. The evaluations show that on the Turkish corpus the classification methods improve stop word detection relative to the frequency-based method. On the English corpora, by contrast, the classification methods yielded no improvement in stop word detection performance.
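
The frequency-based baseline and the inflectional noise problem described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frequency_stop_word_candidates`, the toy token lists, and the placeholder surface forms `fw_a`, `fw_b`, `fw_c` are all assumptions made up for illustration.

```python
from collections import Counter


def frequency_stop_word_candidates(tokens, k):
    """Return the k most frequent tokens as stop word candidates (hypothetical helper)."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Made-up token stream: the function word occurs in a single surface form.
single_form = ["the"] * 6 + ["cat"] * 4 + ["dog"] * 3 + ["sat"] * 2

# Same total count for the function word, but spread over three surface forms
# (placeholders "fw_a", "fw_b", "fw_c"), as inflection would do.
split_forms = (
    ["fw_a"] * 2 + ["fw_b"] * 2 + ["fw_c"] * 2
    + ["cat"] * 4 + ["dog"] * 3 + ["sat"] * 2
)

print(frequency_stop_word_candidates(single_form, 1))  # ['the']
print(frequency_stop_word_candidates(split_forms, 1))  # ['cat']
```

In the second case no single surface form of the function word outranks the content words, which is the kind of noise a pure frequency threshold cannot handle.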

Keywords: stop word, content word, binary classification, tf-idf
Subjects Engineering
Journal Section Articles
Authors

Author: Senem KUMOVA METİN

Author: Bahar KARAOĞLAN

Dates

Publication Date: June 30, 2017

Bibtex
@article{aubtda322136,
  journal = {Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering},
  issn = {1302-3160},
  eissn = {2146-0205},
  address = {},
  publisher = {Eskisehir Technical University},
  year = {2017},
  volume = {18},
  pages = {346--359},
  doi = {10.18038/aubtda.322136},
  title = {STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM},
  key = {cite},
  author = {KUMOVA METİN, Senem and KARAOĞLAN, Bahar}
}
APA KUMOVA METİN, S., KARAOĞLAN, B. (2017). STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM. Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering, 18(2), 346-359. DOI: 10.18038/aubtda.322136
MLA KUMOVA METİN, S., KARAOĞLAN, B. "STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 18 (2017): 346-359 <https://dergipark.org.tr/en/pub/aubtda/issue/29641/322136>
Chicago KUMOVA METİN, S., KARAOĞLAN, B. "STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 18 (2017): 346-359
ISNAD KUMOVA METİN, Senem, KARAOĞLAN, Bahar. "STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM". Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering 18/2 (June 2017): 346-359. https://doi.org/10.18038/aubtda.322136
AMA KUMOVA METİN S, KARAOĞLAN B. STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM. AUBTD-A. 2017; 18(2): 346-359.
Vancouver KUMOVA METİN S, KARAOĞLAN B. STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM. Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering. 2017; 18(2): 346-359.

