A LANGUAGE MODELING APPROACH TO TURKISH TEXT RETRIEVAL
Abstract
We use the Lemur Toolkit, an open-source toolkit designed for information retrieval research, for automated indexing and retrieval experiments on a TREC-like test collection for the Turkish language. We investigate the effectiveness of three retrieval models that Lemur supports, with particular attention to the language modeling approach to information retrieval, combined with language-specific preprocessing techniques. Our experiments show that language-specific preprocessing significantly improves retrieval performance for all three retrieval models. The language modeling approach is also the best-performing retrieval model when language-specific preprocessing is applied.
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Publication Date
November 29, 2010
Submission Date
November 29, 2010
Acceptance Date
-
Published in Issue
Year 2010 Volume: 11 Number: 2