TY - JOUR
T1 - A LANGUAGE MODELING APPROACH TO TURKISH TEXT RETRIEVAL
TT - TÜRKÇE METİN GERİ GETİRİMİNDE DİL MODELLEME YAKLAŞIMI
AU - Yilmazel, Ozgur
PY - 2010
DA - December
JF - Anadolu University Journal of Science and Technology A - Applied Sciences and Engineering
JO - AUJST-A
PB - Eskisehir Technical University
WT - DergiPark
SN - 1302-3160
SP - 163
EP - 172
VL - 11
IS - 2
LA - en
AB - We used the Lemur Toolkit, an open source toolkit designed for Information Retrieval research, for our automated indexing and retrieval experiments on a TREC-like test collection for the Turkish language. We investigate the effectiveness of three retrieval models that Lemur supports, especially the language modeling approach to Information Retrieval, combined with language-specific preprocessing techniques. Our experiments show that language-specific preprocessing significantly improves retrieval performance for all retrieval models. The language modeling approach is also the best-performing retrieval model when language-specific preprocessing is applied.
KW - Turkish information retrieval
KW - Lemur toolkit
KW - Language modeling
N2 - In this study, automated indexing and retrieval experiments were carried out on a TREC-like collection prepared for Turkish, using Lemur, an open source toolkit designed for information retrieval research. Three retrieval models supported by Lemur, chiefly the language modeling approach to information retrieval, were investigated together with language-specific preprocessing techniques. Our experiments showed that language-specific preprocessing techniques improve retrieval performance for all retrieval models. In addition, the best performance for Turkish was obtained with the language modeling approach.
CR - Altingovde, I.S., Ozcan, R., Ocalan, H.C., Can, F. and Ulusoy, O. (2007). Large-scale cluster-based retrieval experiments on Turkish texts. Paper presented at the Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval.
CR - Ogilvie, P. and Callan, J. (2002). Experiments Using the Lemur Toolkit. Paper presented at the Proceedings of the Tenth Text Retrieval Conference (TREC-10).
CR - Arslan, A. and Yilmazel, O. (2008, 19-22 Oct.). A comparison of Relational Databases and information retrieval libraries on Turkish text retrieval. Paper presented at the International Conference on Natural Language Processing and Knowledge Engineering, 2008.
CR - Buckley, C. and Voorhees, E.M. (2004). Retrieval evaluation with incomplete information. Paper presented at the Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval.
CR - Can, F., Kocberber, S., Balcik, E., Kaynak, C., Ocalan, H.C., and Vursavas, O.M. (2008). Information retrieval on Turkish texts. J. Am. Soc. Inf. Sci. Technol. 59(3), 407-421.
CR - Cover, T.M. and Thomas, J.A. (1991). Elements of information theory. Wiley-Interscience.
CR - Eryigit, G. and Adali, E. (2004). An Affix Stripping Morphological Analyzer For Turkish. Paper presented at the IASTED International Conference on Artificial Intelligence and Applications, Innsbruck, Austria.
CR - Harman, D. (1993). Overview of the first TREC conference. Paper presented at the Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval.
CR - Jones, K.S. (1988). A statistical interpretation of term specificity and its application in retrieval. In Document retrieval systems (pp. 132-142). Taylor Graham Publishing.
CR - Manning, C., Raghavan, P. and Schutze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
CR - Ponte, J.M. and Croft, W.B. (1998). A language modeling approach to information retrieval. Paper presented at the Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval.
CR - Robertson, S., Walker, S., Hancock-Beaulieu, M., Gull, A. and Lau, M. (1992). Okapi at TREC-3. Paper presented at the Text REtrieval Conference.
CR - Salton, G., Wong, A. and Yang, C.S. (1975). A vector space model for automatic indexing. Commun. ACM 18(11), 613-620.
CR - Walker, S., Robertson, S.E., Boughanem, M., Jones, G.J.F. and Jones, S. (1998). Okapi at TREC-6 - Automatic ad hoc, VLC, routing, filtering and QSDR.
CR - Zhai, C. Notes on the Lemur TFIDF model, from 1.0/tfidf.ps
CR - Zhai, C. and Lafferty, J. (2001). A study of smoothing methods for language models applied to Ad Hoc information retrieval. Paper presented at the Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
UR - https://dergipark.org.tr/en/pub/aubtda/article/42110
L1 - https://dergipark.org.tr/en/download/article-file/35587
ER -