Classification Performance of the Different Stemming Methods
Abstract
Storing and retrieving textual data has become a basic problem in many fields. How effectively these data can be used depends directly on the storage and access tools available, and software based on a variety of methods has been developed for this purpose. One task such tools must address is classifying the data. Because working directly on raw, unprocessed text is detrimental to classification, reducing the words of a text to their stems first is useful. This study compares the text classification performance of two different stemming algorithms.
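To illustrate why stemming can help before classification, the following minimal Python sketch shows inflected forms of one word collapsing into a single bag-of-words feature. The suffix list and the names `toy_stem` and `bag_of_words` are hypothetical and chosen only for illustration; they are not the two stemming algorithms compared in the study, and they do not reproduce the rules of any published stemmer.

```python
from collections import Counter

# Hypothetical suffix list, for illustration only; not the rules of any
# published stemmer and not the algorithms evaluated in the study.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text: str, stem: bool) -> Counter:
    """Count whitespace-separated tokens, optionally stemming them first."""
    tokens = text.lower().split()
    if stem:
        tokens = [toy_stem(token) for token in tokens]
    return Counter(tokens)


doc = "walks walked walking"
raw = bag_of_words(doc, stem=False)      # three distinct features
stemmed = bag_of_words(doc, stem=True)   # one feature, counted three times
```

With raw tokens, a classifier sees three unrelated features; after stemming, it sees one feature with a higher count. That is the kind of vocabulary reduction that motivates stemming as a preprocessing step.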
Details
Primary Language
English
Publication Date
June 29, 2015
Submission Date
February 10, 2015
Published in Issue
Year 2015 Volume: 3 Number: 3