Research Article

Incremental Refinement of Relevance Rankings: Introducing a New Method Supported with Pennant Retrieval

Year 2022, Volume 36, Issue 2, 169 - 203, 30.06.2022
https://doi.org/10.24146/tk.1062751

Abstract

Purpose: Relevance ranking algorithms rank retrieved documents by the degree of topical similarity (relevance) between search queries and documents. This paper introduces a new relevance ranking method that combines a probabilistic topic modeling algorithm with the “pennant retrieval” method, which uses citation data. Data and Method: We applied the method to the iSearch corpus, which consists of approximately 435,000 physics papers. We first ran the topic modeling algorithm on the titles and abstracts of all papers for 65 search queries and obtained relevance ranking lists. We then used pennant retrieval to fuse citation data with the existing relevance rankings, thereby incrementally refining the results. This produced better relevance rankings, containing papers that cover various aspects of the searched topic as well as papers that are only marginally related to it. We evaluated the retrieval performance of the proposed method by examining the effect of the Maximal Marginal Relevance (MMR) algorithm on each of the relevance rankings we produced. Findings: The findings suggest that the topic modeling algorithm may sometimes overlook terms that are used in different contexts in the papers. Fusing citation data with the relevance ranking lists, however, provides additional contextual information and enriches the results with diverse (interdisciplinary) papers of higher relevance. Moreover, the results can easily be re-ranked and personalized. Implications: We argue that, once it has been tested on dynamic corpora for computational load, robustness, replicability, and scalability, the proposed method can in time be used in both local and international information systems such as TR-Dizin, Web of Science, and Scopus. Originality: As far as we know, the proposed method is the first to show that relevance rankings produced with a topic modeling algorithm can be incrementally refined using pennant retrieval techniques based on citation data.
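To make the role of MMR concrete, the following is a minimal Python sketch of the standard greedy MMR selection rule (Carbonell and Goldstein, 1998): at each step it picks the document that maximizes λ · Sim(d, q) − (1 − λ) · max Sim(d, d′) over the documents d′ already selected. It is an illustration under stated assumptions, not the implementation used in the study; the function name `mmr_rerank`, the toy similarity values, and the choice λ = 0.5 are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def mmr_rerank(
    candidates: Sequence[str],
    query_sim: Dict[str, float],
    doc_sim: Callable[[str, str], float],
    lam: float = 0.7,
    k: int = 10,
) -> List[str]:
    """Greedy Maximal Marginal Relevance re-ranking.

    query_sim maps each document id to its similarity with the query;
    doc_sim returns the similarity between two documents;
    lam trades off relevance (lam = 1) against novelty (lam = 0).
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def mmr_score(doc: str) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((doc_sim(doc, s) for s in selected), default=0.0)
            return lam * query_sim[doc] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): A and B are near-duplicates, C is less
# relevant to the query but covers a different aspect of the topic.
query_sim = {"A": 0.9, "B": 0.85, "C": 0.6}
pair_sim = {
    frozenset(("A", "B")): 0.95,
    frozenset(("A", "C")): 0.1,
    frozenset(("B", "C")): 0.1,
}


def toy_doc_sim(a: str, b: str) -> float:
    return pair_sim[frozenset((a, b))]


diverse = mmr_rerank(["A", "B", "C"], query_sim, toy_doc_sim, lam=0.5, k=3)
relevance_only = mmr_rerank(["A", "B", "C"], query_sim, toy_doc_sim, lam=1.0, k=3)
print(diverse)         # ['A', 'C', 'B']
print(relevance_only)  # ['A', 'B', 'C']
```

With λ = 0.5 the near-duplicate B is pushed below C, whereas λ = 1 reduces to a plain relevance ordering. This is the trade-off between relevance and diversity that MMR is designed to expose.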

Kaynakça

  • Abramo, G., D’Angelo, C. A. ve Zhang, L. (2018). A comparison of two approaches for measuring interdisciplinary research output: The disciplinary diversity of authors vs the disciplinary diversity of the reference list. Journal of Informetrics, 12(4), 1182-1193. https://doi.org/10.1016/j.joi.2018.09.001
  • Adomavicius, G. ve Kwon, Y. (2011). Improving aggregate recommendation diversity using ranking-based techniques. IEEE Transactions on Knowledge and Data Engineering, 24(5), 896-911. https://doi.org/10.1109/TKDE.2011.15
  • ADS Team (2008). SAO/NASA ADS Abstract Service Stopword List. https://adsabs.harvard.edu/abs_doc/stopwords.html
  • Akbulut, M. (2016). Atıf klasiklerinin etkisinin ve ilgililik sıralamalarının pennant diyagramları ile analizi [Yayımlanmamış yüksek lisans tezi]. Hacettepe Üniversitesi. https://hdl.handle.net/11655/3529
  • Akbulut, M., Tonta, Y. ve White, H. D. (2020). Related records retrieval and pennant retrieval: An exploratory case study. Scientometrics, 122(2), 957-987. https://doi.org/10.1007/s11192-019-03303-9
  • Arun, R., Suresh, V., Madhavan, C. V. ve Murthy, M. N. (2010). On finding the natural number of topics with latent dirichlet allocation: Some observations. Pacific-Asia Conference on Knowledge Discovery and Data Mining içinde (s. 391-402). Springer. https://doi.org/10.1007/978-3-642-13657-3_43
  • Baeza-Yates, R. ve Ribeiro-Neto, B. (1999). Modern information retrieval. ACM Press.
  • Ballester, O. ve Penner, O. (2022). Robustness, replicability and scalability in topic modelling. Journal of Informetrics, 16(1). https://doi.org/10.1016/j.joi.2021.101224
  • Bayer, D. ve Michael, S. (2019). Exploring the daschle collection using text mining. arXiv. https://arxiv.org/pdf/1904.12623.pdf
  • Beel, J. ve Gipp, B. (2009). Google Scholar’s ranking algorithm: An introductory overview. B. Larsen ve J. Leta (Yay. haz.). Proceedings of the 12th International Conference on Scientometrics and Informetrics içinde (s. 230-241). International Society for Scientometrics and Informetrics. https://www.issi-society.org/proceedings/issi_2009/ISSI2009-proc-vol1_Aug2009_batch2-paper-1.pdf
  • Beel, J., Gipp, B., Langer, S. ve Breitinger, C. (2016). Research-paper recommender systems: A literature survey. International Journal on Digital Libraries, 17(4): 305-338.
  • Belter, C. W. (2017). A relevance ranking method for citation-based search results. Scientometrics, 112(2), 731-746. https://doi.org/10.1007/s11192-017-2406-y
  • Bichteler, J. ve Eaton III, E.A. (1980). The combined use of bibliographic coupling and cocitation for document retrieval. Journal of the American Society for Information Science, 31(4): 278–282.
  • Blei, D. M. ve Lafferty, J. D. (2009). Topic models. A. Srivastava ve M. Sahami (Yay. haz.). Text Mining: Classification, Clustering and Applications içinde (s. 71-94). CRC Press, Taylor & Francis. http://www.cs.columbia.edu/~blei/papers/BleiLafferty2009.pdf
  • Blei, D. M., Ng, A. Y. ve Jordan, M. I. (2003). Latent dirichlet allocation. The Journal of Machine Learning Research, 3, 993-1022. https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf?TB_iframe=true&width=370.8&height=658.8
  • Bornmann, L., Haunschild, R. ve Mutz, R. (2021). Growth rates of modern science: A latent piecewise growth curve approach to model publication numbers from established and new literature databases. Humanities and Social Sciences Communications, 8(1), 1-15. https://doi.org/10.1057/s41599-021-00903-w
  • Boyd-Graber, J. ve Blei, D. M. (2010). Syntactic topic models. arXiv. https://arxiv.org/pdf/1002.4665.pdf
  • Bradley, K. ve Smyth, B. (2001). Improving recommendation diversity. D. O'Donoghue (Yay. haz.) Proceedings of the Twelfth Irish Conference on Artificial Intelligence and Cognitive Science içinde (s. 141-152). NUIM Department of Computer Science. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.5232&rep=rep1&type=pdf
  • Cambria, E. ve White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48-57. https://doi.org/10.1109/MCI.2014.2307227
  • Cao, J., Xia, T., Li, J., Zhang, Y. ve Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing, 72(7-9), 1775-1781. https://doi.org/10.1016/j.neucom.2008.06.011
  • Carbonell, J. ve Goldstein, J. (1998). The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval içinde (s. 335-336). Association for Computing Machinery. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188.3982&rep=rep1&type=pdf
  • Carevic, Z. ve Mayr, P. (2014). Recommender systems using pennant diagrams in digital libraries. arXiv. https://arxiv.org/pdf/1407.7276v1.pdf
  • Carevic, Z. ve Schaer, P. (2014). On the connection between citation-based and topical relevance ranking: results of a pretest using iSearch. Proceedings of the First Workshop on Bibliometric-enhanced Information Retrieval co-located with 36th European Conference on Information Retrieval (ECIR 2014) içinde (s. 37-44). Springer-Verlag. https://ceur-ws.org/Vol-1143/paper5.pdf
  • Carroll, M. (2018). Changes in media coverage of GCSEs from 1988 to 2017. Cambridge. https://www.cambridgeassessment.org.uk/Images/504456-changes-in-media-coverage-of-gcses-from-1988-to-2017.pdf
  • Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L. ve Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems içinde (s. 288-296). MIT Press. https://proceedings.neurips.cc/paper/2009/file/f92586a25bb3145facd64ab20fd554ff-Paper.pdf
  • Chen, M. ve Décary, M. (2018). A Cognitive-based semantic approach to deep content analysis in search engines. 2018 IEEE 12th International Conference on Semantic Computing (ICSC) içinde (s. 131-139). IEEE. https://doi.ieeecomputersociety.org/10.1109/ICSC.2018.00027
  • Chen, Z. ve Liu, B. (2014). Mining topics in documents: Standing on the shoulders of big data. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1116-1125). ACM. https://dl.acm.org/doi/pdf/10.1145/2623330.2623622
  • Cooper, W. S. (1988). Getting beyond boole. Information Processing & Management, 24(3), 243-248. https://doi.org/10.1016/0306-4573(88)90091-X
  • Croft W. B. (2002). Combining approaches to information retrieval. W.B. Croft (Yay. haz.). Advances in Information Retrieval. The Information Retrieval Series, vol 7. içinde (s. 1-35). Springer, https://doi.org/10.1007/0-306-47019-5_1
  • Danilov, M. (2005). Experimental review on pentaquarks. arXiv. https://arxiv.org/abs/hep-ex/0509012
  • Deveaud, R., SanJuan, E. ve Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document Numérique, 17(1), 61-84. https://doi.org/10.3166/dn.17.1.61-84
  • Ekinci, E. ve İlhan Omurca, S. (2020). Concept-LDA: Incorporating Babelfy into LDA for aspect extraction. Journal of Information Science, 46(3), 406-418. https://doi.org/10.1177/0165551519845854
  • Ganguly, D. ve Jones, G. J. (2018). A non-parametric topical relevance model. Information Retrieval Journal, 21(5), 449-479. https://doi.org/10.1007/s10791-018-9329-y
  • Giustolisi, O., Ridolfi, L. ve Simone, A. (2020). Embedding the intrinsic relevance of vertices in network analysis: the case of centrality metrics. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-60151-x
  • Gläser, J., Glänzel, W. ve Scharnhorst, A. (2017). Same data—different results? Towards a comparative approach to the identification of thematic structures in science. Scientometrics, 111(2), 981-998. https://doi.org/10.1007/s11192-017-2296-z
  • Griffiths, T. L. ve Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(1), 5228-5235. https://doi.org/10.1073/pnas.0307752101
  • Guillemette, J., Simms, B., Zhou, D. ve Mills, S. (2017). Applying latent dirichlet allocation to yelp reviews. https://people.math.carleton.ca/~smills/2017-18/STAT4601-5703/Research%20Projects/2018%20Submissions/GuillemetteSimmsZhouD/Applying%20LDA.pdf
  • Guo, J., Fan, Y., Ai, Q. ve Croft, W. B. (2016). A deep relevance matching model for ad-hoc retrieval. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management içinde (s. 55-64). ACM. https://doi.org/10.1145/2983323.2983769
  • Guo, Z., Zhang, Z. M., Zhu, S., Chi, Y. ve Gong, Y. (2013). A two-level topic model towards knowledge discovery from citation networks. IEEE Transactions on Knowledge and Data Engineering, 26(4), 780-794. https://doi.org/10.1109/TKDE.2013.56
  • Han, X. (2020). Evolution of research topics in LIS between 1996 and 2019: an analysis based on latent dirichlet allocation topic model. Scientometrics, 125(3), 2561-2595. https://doi.org/10.1007/s11192-020-03721-0
  • Hecking, T. ve Leydesdorff, L. (2018). Topic modelling of empirical text corpora: Validity, reliability, and reproducibility in comparison to semantic maps. arXiv. https://arxiv.org/pdf/1806.01045.pdf
  • Herlocker, J.L., Konstan, J. A., Terveen, L. G. ve Riedl, J. T. (2004) Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5-53. https://doi.org/10.1145/963770.963772
  • Holliger, T. S. (2018). Strategic sourcing via category management: Helping air force installation contracting agency eat one piece of the elephant [Yayımlanmamış yüksek lisans tezi]. Air Force Institute of Technology. https://apps.dtic.mil/sti/pdfs/AD1056353.pdf
  • Huang, L., Liu, H., He, J. ve Du, X. (2016). Finding latest influential research papers through modeling two views of citation links. F. Li, K. Shim, K. Zheng ve G. Liu (Yay. haz.) Web Technologies and Applications APWeb 2016 içinde (s. 555-566). Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_45
  • Huang, X., Chen, C., Peng, C., Wu, X., Fu, L. ve Wang, X. (2018). Topic-sensitive influential paper discovery in citation network. D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji ve L. Rashidi (Yay. haz.). Advances in Knowledge Discovery and Data Mining içinde (s. 16-28). Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_2
  • Jin, R., Valizadegan, H. ve Li, H. (2008). Ranking refinement and its application to information retrieval. Proceedings of the 17th International Conference on World Wide Web içinde (s. 397-406). ACM. http://doi.org/10.1145/1367497.1367552
  • Ke, Q., Ferrara, E., Radicchi, F. ve Flammini, A. (2015). Defining and identifying Sleeping Beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426-7431. https://doi.org/10.1073/pnas.1424329112
  • Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1): 10-25
  • Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S., Pontika, N. ve Bayer, V. (2017). Towards effective research recommender systems for repositories. arXiv. https://arxiv.org/abs/1705.00578
  • Kucuktunc, O. ve Ferhatosmanoglu, H. (2011). λ-diverse nearest neighbors browsing for multidimensional data. IEEE Transactions on Knowledge and Data Engineering, 25(3), 481-493. https://doi.org/10.1109/TKDE.2011.251
  • Küçüktunç, O., Saule, E., Kaya, K. ve Çatalyürek, Ü. V. (2015). Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology, 5(4), 1-21. https://doi.org/10.1145/2668106
  • Lei, M., Wang, J., Chen, B. ve Li, X. (2001). Improved relevance ranking in WebGather. Journal of Computer Science and Technology, 16(5), 410-417. https://doi.org/10.1007/bf02948958
  • Leydesdorff, L. ve Nerghes, A. (2017). Co‐word maps and topic modeling: A comparison using small and medium‐sized corpora (N< 1,000). Journal of the Association for Information Science and Technology, 68(4), 1024-1035. https://doi.org/10.1002/asi.23740
  • Li, C., Feng, H. ve Rijke, M. D. (2020). Cascading hybrid bandits: online learning to rank for relevance and diversity. Fourteenth ACM Conference on Recommender Systems içinde (s. 33-42). ACM. https://doi.org/10.1145/3383313.3412245
  • Li, W. ve McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. Proceedings of the 23rd International Conference on Machine Learning içinde (s. 577-584). Springer. https://doi.org/10.1145/1143844.1143917
  • Li, Y., He, J. ve Liu, H. (2017). Topic analysis and influential paper discovery on scientific publications. 2017 14th Web Information Systems and Applications Conference (WISA) içinde (s. 68-73). IEEE. https://doi.org/10.1109/WISA.2017.69
  • Liu, X., Wang, G. ve Zakirul Alam Bhuiyan, M. (2022). Re‐ranking with multiple objective optimization in recommender system. Transactions on Emerging Telecommunications Technologies, 33(1): e4398 https://doi.org/10.1002/ett.4398
  • Liu, X. Z. ve Fang, H. (2020). A comparison among citation-based journal indicators and their relative changes with time. Journal of Informetrics, 14(1), 1-17. https://doi.org/10.1016/j.joi.2020.101007
  • Lykke, M., Larsen, B., Lund, H. ve Ingwersen, P. (2010). Developing a test collection for the evaluation of integrated search. European Conference on Information Retrieval içinde (s. 627-630). Springer. https://doi.org/10.1007/978-3-642-12275-0_63
  • Ma, Z., Liu, Y., Yang, Z., Yang, J. ve Li, K. (2022). A parameter-free approach to lossless summarization of fully dynamic graphs. Information Sciences, 589, 376-394. https://doi.org/10.1016/j.ins.2021.12.116
  • Mahajan, M., Beeferman, D. ve Huang, X. D. (1999). Improved topic-dependent language modeling using information retrieval techniques. 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings içinde (s. 541-544). IEEE. https://doi.org/10.1109/ICASSP.1999.758182
  • Manning, C. ve Schütze, H. (2000). Foundations of statistical natural language processing. MIT Press. https://ics.upjs.sk/~pero/web/documents/pillar/Manning_Schuetze_Statistical NLP.pdf
  • Maron, M. E. ve Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244. https://doi.org/10.1145/321033.321035
  • Marujo, L., Ribeiro, R., Gershman, A., De Matos, D.M., Neto, J.P. ve Carbonell, J. (2017). Event-based summarization using a centrality-as-relevance model. Knowledge and Information Systems, 50, 945–968. https://doi.org/10.1007/s10115-016-0966-4
  • Mayr, P. ve Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. 2013 IEEE International Conference on Big Data içinde (s. 5-8). IEEE. https://doi.org/10.1109/BigData.2013.6691762
  • McNee, S. M., Riedl, J. ve Konstan, J. A. (2006). Being accurate is not enough: how accuracy metrics have hurt recommender systems. CHI'06 extended abstracts on human factors in computing systems içinde (s. 1097-1101). https://doi.org/10.1145/1125451.1125659
  • Meng, W., Yu, C. ve Liu, K. L. (2002). Building efficient and effective metasearch engines. ACM Computing Surveys (CSUR), 34(1), 48-89. https://doi.org/10.1145/505282.505284
  • Mizzaro, S. (1997). Relevance: The whole history. Journal of the American Society for Information Science, 48, 810-832. https://doi.org/10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U
  • Nguyen, T. ve Do, P. (2018). CitationLDA++ an extension of LDA for discovering topics in document network. Proceedings of the Ninth International Symposium on Information and Communication Technology içinde (s. 31-37). ACM. https://doi.org/10.1145/3287921.3287930
  • Nikita, M. (2020, 20 Nisan). Select number of topics for LDA. https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html
  • Nolasco, D. ve Oliveira, J. (2016). Detecting knowledge innovation through automatic topic labeling on scholar data. 2016 49th Hawaii International Conference on System Sciences (HICSS) içinde (s. 358-367). IEEE. https://doi.org/10.1109/HICSS.2016.51
  • Pao, M. L. (1993). Term and citation retrieval: A field study. Information Processing & Management. 29(1), 95-112. https://doi.org/10.1016/0306-4573(93)90026-A
  • Ponweiser, M. (2012). Latent dirichlet allocation in R. [Yayımlanmamış yüksek lisans tezi]. Viyana Üniversitesi. https://epub.wu.ac.at/id/eprint/3558
  • Rafols, I., Leydesdorff, L., O’Hare, A., Nightingale, P. ve Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between innovation studies and business & management. Research Policy, 41(7), 1262-1282. https://doi.org/10.1016/j.respol.2012.03.015
  • Ribeiro, R., ve de Matos, D.M. (2011). Revisiting Centrality-as-relevance: support sets and similarity as geometric proximity. Journal of Artificial Intelligence Research, 42, 275-308. https://doi.org/10.1613/jair.3387
  • Robertson, S. E. (1977). The probability ranking principle in IR. Journal of Documentation, 33(4), 294-304. https://doi.org/10.1108/eb026647
  • Rüdiger, M. S., Antons, D. ve Salge, T. O. (2021). The explanatory power of citations: a new approach to unpacking impact in science. Scientometrics, 126, 9779-9809. https://doi.org/10.1007/s11192-021-04103-w
  • Salton, G., Yang, C. ve Wong, A. (1975). A vector space model for automatic indexing. Communications of the ACM, 18, 613-620. https://doi.org/10.1145/361219.361220
  • Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141-156. https://doi.org/10.1016/j.esp.2002.10.001
  • Saracevic, T. (2021). Relevance: In search of a theoretical foundation. D. H. Sonnenwald (Yay. haz.), Theory Development in the Information Sciences içinde (s. 141-163). University of Texas Press. https://doi.org/10.7560/308240-011
  • Sperber, D. ve Wilson, D. (1995). Relevance: Communication and cognition. Blackwell. https://monoskop.org/images/e/e6/Sperber_Dan_Wilson_Deirdre_Relevance _Communica_and_Cognition_2nd_edition_1996.pdf
  • Swanson, D. R. (1986a). Subjective versus objective relevance in bibliographic retrieval systems. The Library Quarterly, 56(4), 389-398. https://doi.org/10.1086/601800
  • Swanson, D. R. (1986b). Fish oil, Raynaud’s syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1):7-18. https://doi.org/10.1353/pbm.1986.0087
  • Thara, D. K., PremaSudha, B. G. ve Xiong, F. (2019). Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recognition Letters, 128, 544-550. https://doi.org/10.1016/j.patrec.2019.10.029
  • Thompson, P. (2007). Looking back: On relevance, probabilistic indexing and information retrieval. Information Processing & Management, 44(2), 963-970. https://doi.org/10.1016/j.ipm.2007.10.002
  • Tonta, Y. (1995). Bilgi erişim sistemleri. Türk Kütüphaneciliği, 9(3), 302-314. https://eprints.rclis.org/9571/
  • Tonta, Y. ve Akbulut, M. (2021). Uluslararası dergilerde yayımlanan Türkiye adresli makalelerin atıf etkisini artıran faktörler. Türk Kütüphaneciliği, 35(3), 388-409. https://doi.org/10.24146/tk.933159
  • Vergoulis, T., Chatzopoulos, S., Kanellos, I., Deligiannis, P., Tryfonopoulos, C. ve Dalamagas, T. (2019). BIP! finder: Facilitating scientific literature search by exploiting impact-based ranking. Proceedings of the 28th ACM International Conference on Information and Knowledge Management içinde (s. 2937-2940). ACM. https://doi.org/10.1145/3357384.3357850
  • Verma, M., Yılmaz, E. ve Craswell, N. (2016). On obtaining effort based judgements for information retrieval. Proceedings of the 9th ACM International Conference on Web Search and Data Mining içinde (s. 277-286). ACM. https://doi.org/10.1145/2835776.2835840
  • Wang, X., Zhai, C. ve Roth, D. (2013). Understanding evolution of research themes: a probabilistic generative model for citations. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1115-1123). ACM. https://doi.org/10.1145/2487575.2487698
  • White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory. Part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58, 536-559. https://doi.org/10.1002/asi.20543
  • White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory. Part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58, 583-605. https://doi.org/10.1002/asi.20542
  • White, H. D. (2009). Pennants for Strindberg and Persson. Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday. Special volume of the E-newsletter of the International Society for Scientometrics and Informetrics, 5, 71-83. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.2055&rep=rep1&type=pdf#page=73
  • White, H. D. (2010). Some new tests of relevance theory in information science. Scientometrics, 83, 653-667. https://doi.org/10.1007/s11192-009-0138-3
  • White, H. D. (2015). Co-cited author retrieval and relevance theory: examples from the humanities. Scientometrics, 102(3), 2275-2299. https://doi.org/10.1007/s11192-014-1483-4
  • White, H. D. (2016). Bag of works retrieval: TF*IDF weighting of co-cited works. Proceedings of the 3rd Workshop on Bibliometric-Enhanced Information Retrieval (BIR2016) içinde (s. 63-72). https://ceur-ws.org/Vol-1567/paper7.pdf
  • White, H. D. (2018). Bag of works retrieval: TF*IDF weighting of co-cited works with a seed. International Journal of Digital Libraries, 19, 139-149. https://doi.org/10.1007/s00799-017-0217-7
  • White, H. D. ve McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4): 327-355. https://doi.org/10.1002/(SICI)1097-4571(19980401)49:4%3C327::AID-ASI4%3E3.0.CO;2-4
  • Wilson, P. (1978). Some fundamental concepts of information retrieval. Drexel Library Quarterly, 14(2), 10-24.
  • Wilson, D. ve Sperber, D. (2002). Relevance theory. G. Ward ve L. Horn (Yay. haz.) Handbook of pragmatics içinde (s. 1-55). Blackwell. https://jeannicod.ccsd.cnrs.fr/ijn_00000101/document
  • Wu, H. C., Luk, R. W., Wong, K. F. ve Kwok, K. L. (2007). A retrospective study of a hybrid document-context based retrieval model. Information Processing & Management, 43(5), 1308-1331. https://doi.org/10.1016/j.ipm.2006.10.009
  • Wu, J., Son, G. ve Wang, S. (2020). A competency mining method based on Latent Dirichlet Allocation (LDA) model. Journal of Physics: Conference Series (Vol. 1682, No. 1, p. 012059) içinde. IOP Publishing. https://iopscience.iop.org/article/10.1088/1742-6596/1682/1/012059/meta
  • Xia, H., Li, J., Tang, J. ve Moens MF. (2012). Plink-LDA: Using link as prior information in topic modeling. S. Lee, Z. Peng, X. Zhou, Y. S. Moon, R. Unland ve J. Yoo (Yay. haz.) Database Systems for Advanced Applications içinde (s. 213-227). Springer. https://doi.org/10.1007/978-3-642-29038-1_17
  • Xie, X., Liang, Y., Li, X. ve Tan, W. (2019). CuLDA_CGS: Solving large-scale LDA problems on GPUs. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming içinde (s. 435-436). ACM. https://doi.org/10.1145/3293883.3301496
  • Yang, H. T., Ju, J. H., Wong, Y. T., Shmulevich, I. ve Chiang, J. H. (2017). Literature-based discovery of new candidates for drug repurposing. Briefings in Bioinformatics, 18(3), 488-497. https://doi.org/10.1093/bib/bbw030
  • Yang, L., Ji, D. ve Leong, M. (2007). Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval. Information Processing & Management, 43(2), 315-326. https://doi.org/10.1016/j.ipm.2006.07.011
  • Yılmaz, E., Verma, M., Craswell, N., Radlinski, F. ve Bailey, P. (2014). Relevance and effort: An analysis of document utility. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management içinde (s. 91-100). ACM. https://doi.org/10.1145/2661829.2661953
  • Zarrinkalam, F. ve Kahani, M. (2012). A new metric for measuring relatedness of scientific papers based on non-textual features. Intelligent Information Management, 4(4), 99-107. https://www.scirp.org/pdf/IIM20120400001_98298896.pdf
  • Zhou, H. K., Yu, H. M. ve Hu, R. (2017). Topic discovery and evolution in scientific literature based on content and citations. Frontiers of Information Technology & Electronic Engineering, 18(10), 1511-1524. https://doi.org/10.1631/FITEE.1601125
  • Zou, L., Liu, X., Buntine, W. ve Liu, Y. (2021). Citation context-based topic models: discovering cited and citing topics from full text. Library Hi Tech, 39(4), 1063-1083. https://doi.org/10.1108/LHT-01-2021-0041

Incremental Improvement of Relevance Rankings: A Proposal for a New Method Supported by Pennant Retrieval

Abstract

Purpose: Relevance ranking algorithms rank retrieved documents by the degree of topical similarity (relevance) between search queries and documents. The aim of this study is to develop a new relevance ranking method that combines a probabilistic topic modeling algorithm with “pennant retrieval” based on citation data. Data Sources and Method: We applied the method to the iSearch corpus, which consists of approximately 435,000 physics papers. We first ran the topic modeling algorithm on the titles and abstracts of all papers in the corpus for 65 queries and obtained relevance rankings. We then used the citation information obtained by applying the pennant retrieval method to fuse with and further improve the existing relevance rankings. In this way we obtained better relevance rankings, consisting of papers that cover different aspects of the searched topic as well as papers that are marginally relevant to it. We evaluated the retrieval performance of the proposed method by examining separately the effects of the Maximal Marginal Relevance (MMR) algorithm on the different relevance rankings. Findings: The findings show that, in relevance rankings obtained with the topic modeling algorithm, some terms occurring in the titles and abstracts of papers can sometimes be overlooked. When these rankings are supported with pennant retrieval based on citation data, however, additional information about the contexts of the terms is obtained, and the result is richer relevance rankings containing more relevant and more diverse (interdisciplinary) papers. Moreover, the retrieval output can easily be re-ranked according to researchers' priorities (personalization). Conclusion: In the proposed method we focused on incrementally improving existing relevance ranking algorithms using pennant retrieval techniques. We believe that, once it has been tested on dynamic corpora for computational load, robustness, replicability, and scalability, the method can in time be used in both local and international information systems such as TR-Dizin, Web of Science, and Scopus. Originality: This study proposes a new relevance ranking method. To the best of our knowledge, it is the first study to show that relevance rankings obtained with the LDA topic modeling algorithm can be incrementally improved using pennant retrieval techniques based on citation data.
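As a rough illustration of the citation side of the method, the sketch below computes tf*idf-style pennant weights for papers co-cited with a seed paper. It assumes the formulation commonly associated with White's pennant diagrams: tf = 1 + log10(co-citation count with the seed) and idf = log10(N / total citation count), where N is the corpus size. The abstract does not specify the exact weighting or the fusion step used in the study, so the function name `pennant_scores`, the ranking by the tf × idf product, and the toy counts are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def pennant_scores(
    cocitations: Dict[str, int],
    citations: Dict[str, int],
    n_docs: int,
) -> List[Tuple[str, float, float, float]]:
    """Return (doc, tf, idf, tf*idf) rows sorted by descending tf*idf.

    cocitations: number of times each paper is co-cited with the seed.
    citations:   total number of citations each paper has in the corpus.
    n_docs:      number of documents in the corpus.
    """
    rows = []
    for doc, co_count in cocitations.items():
        if co_count <= 0:
            continue  # never co-cited with the seed, so not on the pennant
        tf = 1.0 + math.log10(co_count)             # strength of the link to the seed
        idf = math.log10(n_docs / citations[doc])   # specificity of the paper
        rows.append((doc, tf, idf, tf * idf))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Toy counts (hypothetical) in a corpus of 100,000 papers.
toy_cocitations = {"P1": 100, "P2": 10, "P3": 1}
toy_citations = {"P1": 1000, "P2": 10, "P3": 100000}

toy_rows = pennant_scores(toy_cocitations, toy_citations, n_docs=100000)
for doc, tf, idf, score in toy_rows:
    print(f"{doc}: tf={tf:.2f} idf={idf:.2f} tf*idf={score:.2f}")
```

In this toy example the rarely cited but strongly linked paper P2 outranks the heavily cited P1, while a paper cited by the whole corpus (P3) receives an idf of zero. How such weights are fused with the topic-model rankings is a separate design decision that this sketch does not attempt to reproduce.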

Kaynakça

  • Abramo, G., D’Angelo, C. A. ve Zhang, L. (2018). A comparison of two approaches for measuring interdisciplinary research output: The disciplinary diversity of authors vs the disciplinary diversity of the reference list. Journal of Informetrics, 12(4), 1182-1193. https://doi.org/10.1016/j.joi.2018.09.001
  • Adomavicius, G. ve Kwon, Y. (2011). Improving aggregate recommendation diversity using ranking-based techniques. IEEE Transactions on Knowledge and Data Engineering, 24(5), 896-911. https://doi.org/10.1109/TKDE.2011.15
  • ADS Team (2008). SAO/NASA ADS Abstract Service Stopword List. https://adsabs.harvard.edu/abs_doc/stopwords.html
  • Akbulut, M. (2016). Atıf klasiklerinin etkisinin ve ilgililik sıralamalarının pennant diyagramları ile analizi [Yayımlanmamış yüksek lisans tezi]. Hacettepe Üniversitesi. https://hdl.handle.net/11655/3529
  • Akbulut, M., Tonta, Y. ve White, H. D. (2020). Related records retrieval and pennant retrieval: An exploratory case study. Scientometrics, 122(2), 957-987. https://doi.org/10.1007/s11192-019-03303-9
  • Arun, R., Suresh, V., Madhavan, C. V. ve Murthy, M. N. (2010). On finding the natural number of topics with latent dirichlet allocation: Some observations. Pacific-Asia Conference on Knowledge Discovery and Data Mining içinde (s. 391-402). Springer. https://doi.org/10.1007/978-3-642-13657-3_43
  • Baeza-Yates, R. ve Ribeiro-Neto, B. (1999). Modern information retrieval. ACM Press.
  • Ballester, O. ve Penner, O. (2022). Robustness, replicability and scalability in topic modelling. Journal of Informetrics, 16(1). https://doi.org/10.1016/j.joi.2021.101224
  • Bayer, D. ve Michael, S. (2019). Exploring the daschle collection using text mining. arXiv. https://arxiv.org/pdf/1904.12623.pdf
  • Beel, J. ve Gipp, B. (2009). Google Scholar’s ranking algorithm: An introductory overview. B. Larsen ve J. Leta (Yay. haz.). Proceedings of the 12th International Conference on Scientometrics and Informetrics içinde (s. 230-241). International Society for Scientometrics and Informetrics. https://www.issi-society.org/proceedings/issi_2009/ISSI2009-proc-vol1_Aug2009_batch2-paper-1.pdf
  • Beel, J., Gipp, B., Langer, S. ve Breitinger, C. (2016). Research-paper recommender systems: A literature survey. International Journal on Digital Libraries, 17(4): 305-338.
  • Belter, C. W. (2017). A relevance ranking method for citation-based search results. Scientometrics, 112(2), 731-746. https://doi.org/10.1007/s11192-017-2406-y
  • Bichteler, J. ve Eaton III, E. A. (1980). The combined use of bibliographic coupling and cocitation for document retrieval. Journal of the American Society for Information Science, 31(4), 278-282.
  • Blei, D. M. ve Lafferty, J. D. (2009). Topic models. A. Srivastava ve M. Sahami (Yay. haz.). Text Mining: Classification, Clustering and Applications içinde (s. 71-94). CRC Press, Taylor & Francis. http://www.cs.columbia.edu/~blei/papers/BleiLafferty2009.pdf
  • Blei, D. M., Ng, A. Y. ve Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 993-1022. https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf
  • Bornmann, L., Haunschild, R. ve Mutz, R. (2021). Growth rates of modern science: A latent piecewise growth curve approach to model publication numbers from established and new literature databases. Humanities and Social Sciences Communications, 8(1), 1-15. https://doi.org/10.1057/s41599-021-00903-w
  • Boyd-Graber, J. ve Blei, D. M. (2010). Syntactic topic models. arXiv. https://arxiv.org/pdf/1002.4665.pdf
  • Bradley, K. ve Smyth, B. (2001). Improving recommendation diversity. D. O'Donoghue (Yay. haz.). Proceedings of the Twelfth Irish Conference on Artificial Intelligence and Cognitive Science içinde (s. 141-152). NUIM Department of Computer Science. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.5232&rep=rep1&type=pdf
  • Cambria, E. ve White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48-57. https://doi.org/10.1109/MCI.2014.2307227
  • Cao, J., Xia, T., Li, J., Zhang, Y. ve Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing, 72(7-9), 1775-1781. https://doi.org/10.1016/j.neucom.2008.06.011
  • Carbonell, J. ve Goldstein, J. (1998). The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval içinde (s. 335-336). Association for Computing Machinery. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188.3982&rep=rep1&type=pdf
  • Carevic, Z. ve Mayr, P. (2014). Recommender systems using pennant diagrams in digital libraries. arXiv. https://arxiv.org/pdf/1407.7276v1.pdf
  • Carevic, Z. ve Schaer, P. (2014). On the connection between citation-based and topical relevance ranking: results of a pretest using iSearch. Proceedings of the First Workshop on Bibliometric-enhanced Information Retrieval co-located with 36th European Conference on Information Retrieval (ECIR 2014) içinde (s. 37-44). Springer-Verlag. https://ceur-ws.org/Vol-1143/paper5.pdf
  • Carroll, M. (2018). Changes in media coverage of GCSEs from 1988 to 2017. Cambridge. https://www.cambridgeassessment.org.uk/Images/504456-changes-in-media-coverage-of-gcses-from-1988-to-2017.pdf
  • Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L. ve Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems içinde (s. 288-296). MIT Press. https://proceedings.neurips.cc/paper/2009/file/f92586a25bb3145facd64ab20fd554ff-Paper.pdf
  • Chen, M. ve Décary, M. (2018). A cognitive-based semantic approach to deep content analysis in search engines. 2018 IEEE 12th International Conference on Semantic Computing (ICSC) içinde (s. 131-139). IEEE. https://doi.ieeecomputersociety.org/10.1109/ICSC.2018.00027
  • Chen, Z. ve Liu, B. (2014). Mining topics in documents: Standing on the shoulders of big data. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1116-1125). ACM. https://dl.acm.org/doi/pdf/10.1145/2623330.2623622
  • Cooper, W. S. (1988). Getting beyond Boole. Information Processing & Management, 24(3), 243-248. https://doi.org/10.1016/0306-4573(88)90091-X
  • Croft, W. B. (2002). Combining approaches to information retrieval. W. B. Croft (Yay. haz.). Advances in Information Retrieval (The Information Retrieval Series, vol. 7) içinde (s. 1-35). Springer. https://doi.org/10.1007/0-306-47019-5_1
  • Danilov, M. (2005). Experimental review on pentaquarks. arXiv. https://arxiv.org/abs/hep-ex/0509012
  • Deveaud, R., SanJuan, E. ve Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document Numérique, 17(1), 61-84. https://doi.org/10.3166/dn.17.1.61-84
  • Ekinci, E. ve İlhan Omurca, S. (2020). Concept-LDA: Incorporating Babelfy into LDA for aspect extraction. Journal of Information Science, 46(3), 406-418. https://doi.org/10.1177/0165551519845854
  • Ganguly, D. ve Jones, G. J. (2018). A non-parametric topical relevance model. Information Retrieval Journal, 21(5), 449-479. https://doi.org/10.1007/s10791-018-9329-y
  • Giustolisi, O., Ridolfi, L. ve Simone, A. (2020). Embedding the intrinsic relevance of vertices in network analysis: the case of centrality metrics. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-60151-x
  • Gläser, J., Glänzel, W. ve Scharnhorst, A. (2017). Same data—different results? Towards a comparative approach to the identification of thematic structures in science. Scientometrics, 111(2), 981-998. https://doi.org/10.1007/s11192-017-2296-z
  • Griffiths, T. L. ve Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Suppl. 1), 5228-5235. https://doi.org/10.1073/pnas.0307752101
  • Guillemette, J., Simms, B., Zhou, D. ve Mills, S. (2017). Applying latent Dirichlet allocation to Yelp reviews. https://people.math.carleton.ca/~smills/2017-18/STAT4601-5703/Research%20Projects/2018%20Submissions/GuillemetteSimmsZhouD/Applying%20LDA.pdf
  • Guo, J., Fan, Y., Ai, Q. ve Croft, W. B. (2016). A deep relevance matching model for ad-hoc retrieval. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management içinde (s. 55-64). ACM. https://doi.org/10.1145/2983323.2983769
  • Guo, Z., Zhang, Z. M., Zhu, S., Chi, Y. ve Gong, Y. (2013). A two-level topic model towards knowledge discovery from citation networks. IEEE Transactions on Knowledge and Data Engineering, 26(4), 780-794. https://doi.org/10.1109/TKDE.2013.56
  • Han, X. (2020). Evolution of research topics in LIS between 1996 and 2019: an analysis based on latent Dirichlet allocation topic model. Scientometrics, 125(3), 2561-2595. https://doi.org/10.1007/s11192-020-03721-0
  • Hecking, T. ve Leydesdorff, L. (2018). Topic modelling of empirical text corpora: Validity, reliability, and reproducibility in comparison to semantic maps. arXiv. https://arxiv.org/pdf/1806.01045.pdf
  • Herlocker, J. L., Konstan, J. A., Terveen, L. G. ve Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5-53. https://doi.org/10.1145/963770.963772
  • Holliger, T. S. (2018). Strategic sourcing via category management: Helping Air Force Installation Contracting Agency eat one piece of the elephant [Yayımlanmamış yüksek lisans tezi]. Air Force Institute of Technology. https://apps.dtic.mil/sti/pdfs/AD1056353.pdf
  • Huang, L., Liu, H., He, J. ve Du, X. (2016). Finding latest influential research papers through modeling two views of citation links. F. Li, K. Shim, K. Zheng ve G. Liu (Yay. haz.). Web Technologies and Applications APWeb 2016 içinde (s. 555-566). Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_45
  • Huang, X., Chen, C., Peng, C., Wu, X., Fu, L. ve Wang, X. (2018). Topic-sensitive influential paper discovery in citation network. D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji ve L. Rashidi (Yay. haz.). Advances in Knowledge Discovery and Data Mining içinde (s. 16-28). Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_2
  • Jin, R., Valizadegan, H. ve Li, H. (2008). Ranking refinement and its application to information retrieval. Proceedings of the 17th International Conference on World Wide Web içinde (s. 397-406). ACM. http://doi.org/10.1145/1367497.1367552
  • Ke, Q., Ferrara, E., Radicchi, F. ve Flammini, A. (2015). Defining and identifying Sleeping Beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426-7431. https://doi.org/10.1073/pnas.1424329112
  • Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1), 10-25.
  • Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S., Pontika, N. ve Bayer, V. (2017). Towards effective research recommender systems for repositories. arXiv. https://arxiv.org/abs/1705.00578
  • Kucuktunc, O. ve Ferhatosmanoglu, H. (2011). λ-diverse nearest neighbors browsing for multidimensional data. IEEE Transactions on Knowledge and Data Engineering, 25(3), 481-493. https://doi.org/10.1109/TKDE.2011.251
  • Küçüktunç, O., Saule, E., Kaya, K. ve Çatalyürek, Ü. V. (2015). Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology, 5(4), 1-21. https://doi.org/10.1145/2668106
  • Lei, M., Wang, J., Chen, B. ve Li, X. (2001). Improved relevance ranking in WebGather. Journal of Computer Science and Technology, 16(5), 410-417. https://doi.org/10.1007/bf02948958
  • Leydesdorff, L. ve Nerghes, A. (2017). Co‐word maps and topic modeling: A comparison using small and medium‐sized corpora (N< 1,000). Journal of the Association for Information Science and Technology, 68(4), 1024-1035. https://doi.org/10.1002/asi.23740
  • Li, C., Feng, H. ve Rijke, M. D. (2020). Cascading hybrid bandits: online learning to rank for relevance and diversity. Fourteenth ACM Conference on Recommender Systems içinde (s. 33-42). ACM. https://doi.org/10.1145/3383313.3412245
  • Li, W. ve McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. Proceedings of the 23rd International Conference on Machine Learning içinde (s. 577-584). ACM. https://doi.org/10.1145/1143844.1143917
  • Li, Y., He, J. ve Liu, H. (2017). Topic analysis and influential paper discovery on scientific publications. 2017 14th Web Information Systems and Applications Conference (WISA) içinde (s. 68-73). IEEE. https://doi.org/10.1109/WISA.2017.69
  • Liu, X., Wang, G. ve Zakirul Alam Bhuiyan, M. (2022). Re‐ranking with multiple objective optimization in recommender system. Transactions on Emerging Telecommunications Technologies, 33(1), e4398. https://doi.org/10.1002/ett.4398
  • Liu, X. Z. ve Fang, H. (2020). A comparison among citation-based journal indicators and their relative changes with time. Journal of Informetrics, 14(1), 1-17. https://doi.org/10.1016/j.joi.2020.101007
  • Lykke, M., Larsen, B., Lund, H. ve Ingwersen, P. (2010). Developing a test collection for the evaluation of integrated search. European Conference on Information Retrieval içinde (s. 627-630). Springer. https://doi.org/10.1007/978-3-642-12275-0_63
  • Ma, Z., Liu, Y., Yang, Z., Yang, J. ve Li, K. (2022). A parameter-free approach to lossless summarization of fully dynamic graphs. Information Sciences, 589, 376-394. https://doi.org/10.1016/j.ins.2021.12.116
  • Mahajan, M., Beeferman, D. ve Huang, X. D. (1999). Improved topic-dependent language modeling using information retrieval techniques. 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings içinde (s. 541-544). IEEE. https://doi.org/10.1109/ICASSP.1999.758182
  • Manning, C. ve Schütze, H. (2000). Foundations of statistical natural language processing. MIT Press. https://ics.upjs.sk/~pero/web/documents/pillar/Manning_Schuetze_Statistical NLP.pdf
  • Maron, M. E. ve Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244. https://doi.org/10.1145/321033.321035
  • Marujo, L., Ribeiro, R., Gershman, A., De Matos, D. M., Neto, J. P. ve Carbonell, J. (2017). Event-based summarization using a centrality-as-relevance model. Knowledge and Information Systems, 50, 945-968. https://doi.org/10.1007/s10115-016-0966-4
  • Mayr, P. ve Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. 2013 IEEE International Conference on Big Data içinde (s. 5-8). IEEE. https://doi.org/10.1109/BigData.2013.6691762
  • McNee, S. M., Riedl, J. ve Konstan, J. A. (2006). Being accurate is not enough: how accuracy metrics have hurt recommender systems. CHI'06 extended abstracts on human factors in computing systems içinde (s. 1097-1101). https://doi.org/10.1145/1125451.1125659
  • Meng, W., Yu, C. ve Liu, K. L. (2002). Building efficient and effective metasearch engines. ACM Computing Surveys (CSUR), 34(1), 48-89. https://doi.org/10.1145/505282.505284
  • Mizzaro, S. (1997). Relevance: The whole history. Journal of the American Society for Information Science, 48, 810-832. https://doi.org/10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U
  • Nguyen, T. ve Do, P. (2018). CitationLDA++ an extension of LDA for discovering topics in document network. Proceedings of the Ninth International Symposium on Information and Communication Technology içinde (s. 31-37). ACM. https://doi.org/10.1145/3287921.3287930
  • Nikita, M. (2020, 20 Nisan). Select number of topics for LDA. https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html
  • Nolasco, D. ve Oliveira, J. (2016). Detecting knowledge innovation through automatic topic labeling on scholar data. 2016 49th Hawaii International Conference on System Sciences (HICSS) içinde (s. 358-367). IEEE. https://doi.org/10.1109/HICSS.2016.51
  • Pao, M. L. (1993). Term and citation retrieval: A field study. Information Processing & Management, 29(1), 95-112. https://doi.org/10.1016/0306-4573(93)90026-A
  • Ponweiser, M. (2012). Latent Dirichlet allocation in R [Yayımlanmamış yüksek lisans tezi]. Viyana Üniversitesi. https://epub.wu.ac.at/id/eprint/3558
  • Rafols, I., Leydesdorff, L., O’Hare, A., Nightingale, P. ve Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between innovation studies and business & management. Research Policy, 41(7), 1262-1282. https://doi.org/10.1016/j.respol.2012.03.015
  • Ribeiro, R. ve de Matos, D. M. (2011). Revisiting centrality-as-relevance: Support sets and similarity as geometric proximity. Journal of Artificial Intelligence Research, 42, 275-308. https://doi.org/10.1613/jair.3387
  • Robertson, S. E. (1977). The probability ranking principle in IR. Journal of Documentation, 33(4), 294-304. https://doi.org/10.1108/eb026647
  • Rüdiger, M. S., Antons, D. ve Salge, T. O. (2021). The explanatory power of citations: a new approach to unpacking impact in science. Scientometrics, 126, 9779-9809. https://doi.org/10.1007/s11192-021-04103-w
  • Salton, G., Yang, C. ve Wong, A. (1975). A vector space model for automatic indexing. Communications of the ACM, 18, 613-620. https://doi.org/10.1145/361219.361220
  • Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141-156. https://doi.org/10.1016/j.esp.2002.10.001
  • Saracevic, T. (2021). Relevance: In search of a theoretical foundation. D. H. Sonnenwald (Yay. haz.). Theory Development in the Information Sciences içinde (s. 141-163). University of Texas Press. https://doi.org/10.7560/308240-011
  • Sperber, D. ve Wilson, D. (1995). Relevance: Communication and cognition. Blackwell. https://monoskop.org/images/e/e6/Sperber_Dan_Wilson_Deirdre_Relevance _Communica_and_Cognition_2nd_edition_1996.pdf
  • Swanson, D. R. (1986a). Subjective versus objective relevance in bibliographic retrieval systems. The Library Quarterly, 56(4), 389-398. https://doi.org/10.1086/601800
  • Swanson, D. R. (1986b). Fish oil, Raynaud’s syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1), 7-18. https://doi.org/10.1353/pbm.1986.0087
  • Thara, D. K., PremaSudha, B. G. ve Xiong, F. (2019). Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recognition Letters, 128, 544-550. https://doi.org/10.1016/j.patrec.2019.10.029
  • Thompson, P. (2007). Looking back: On relevance, probabilistic indexing and information retrieval. Information Processing & Management, 44(2), 963-970. https://doi.org/10.1016/j.ipm.2007.10.002
  • Tonta, Y. (1995). Bilgi erişim sistemleri. Türk Kütüphaneciliği, 9(3), 302-314. https://eprints.rclis.org/9571/
  • Tonta, Y. ve Akbulut, M. (2021). Uluslararası dergilerde yayımlanan Türkiye adresli makalelerin atıf etkisini artıran faktörler. Türk Kütüphaneciliği, 35(3), 388-409. https://doi.org/10.24146/tk.933159
  • Vergoulis, T., Chatzopoulos, S., Kanellos, I., Deligiannis, P., Tryfonopoulos, C. ve Dalamagas, T. (2019). BIP! finder: Facilitating scientific literature search by exploiting impact-based ranking. Proceedings of the 28th ACM International Conference on Information and Knowledge Management içinde (s. 2937-2940). ACM. https://doi.org/10.1145/3357384.3357850
  • Verma, M., Yılmaz, E. ve Craswell, N. (2016). On obtaining effort based judgements for information retrieval. Proceedings of the 9th ACM International Conference on Web Search and Data Mining içinde (s. 277-286). ACM. https://doi.org/10.1145/2835776.2835840
  • Wang, X., Zhai, C. ve Roth, D. (2013). Understanding evolution of research themes: a probabilistic generative model for citations. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1115-1123). ACM. https://doi.org/10.1145/2487575.2487698
  • White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory. Part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58, 536-559. https://doi.org/10.1002/asi.20543
  • White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory. Part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58, 583-605. https://doi.org/10.1002/asi.20542
  • White, H. D. (2009). Pennants for Strindberg and Persson. Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday. Special volume of the E-newsletter of the International Society for Scientometrics and Informetrics, 5, 71-83. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.2055&rep=rep1&type=pdf#page=73
  • White, H. D. (2010). Some new tests of relevance theory in information science. Scientometrics, 83, 653-667. https://doi.org/10.1007/s11192-009-0138-3
  • White, H. D. (2015). Co-cited author retrieval and relevance theory: examples from the humanities. Scientometrics, 102(3), 2275-2299. https://doi.org/10.1007/s11192-014-1483-4
  • White, H. D. (2016). Bag of works retrieval: TF*IDF weighting of co-cited works. Proceedings of the 3rd Workshop on Bibliometric-Enhanced Information Retrieval (BIR2016) içinde (s. 63-72). https://ceur-ws.org/Vol-1567/paper7.pdf
  • White, H. D. (2018). Bag of works retrieval: TF*IDF weighting of co-cited works with a seed. International Journal of Digital Libraries, 19, 139-149. https://doi.org/10.1007/s00799-017-0217-7
  • White, H. D. ve McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4), 327-355. https://doi.org/10.1002/(SICI)1097-4571(19980401)49:4%3C327::AID-ASI4%3E3.0.CO;2-4
  • Wilson, P. (1978). Some fundamental concepts of information retrieval. Drexel Library Quarterly, 14(2), 10-24.
  • Wilson, D. ve Sperber, D. (2002). Relevance theory. G. Ward ve L. Horn (Yay. haz.). Handbook of pragmatics içinde (s. 1-55). Blackwell. https://jeannicod.ccsd.cnrs.fr/ijn_00000101/document
  • Wu, H. C., Luk, R. W., Wong, K. F. ve Kwok, K. L. (2007). A retrospective study of a hybrid document-context based retrieval model. Information Processing & Management, 43(5), 1308-1331. https://doi.org/10.1016/j.ipm.2006.10.009
  • Wu, J., Son, G. ve Wang, S. (2020). A competency mining method based on latent Dirichlet allocation (LDA) model. Journal of Physics: Conference Series, 1682(1), 012059. https://iopscience.iop.org/article/10.1088/1742-6596/1682/1/012059/meta
  • Xia, H., Li, J., Tang, J. ve Moens, M. F. (2012). Plink-LDA: Using link as prior information in topic modeling. S. Lee, Z. Peng, X. Zhou, Y. S. Moon, R. Unland ve J. Yoo (Yay. haz.). Database Systems for Advanced Applications içinde (s. 213-227). Springer. https://doi.org/10.1007/978-3-642-29038-1_17
  • Xie, X., Liang, Y., Li, X. ve Tan, W. (2019). CuLDA_CGS: Solving large-scale LDA problems on GPUs. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming içinde (s. 435-436). ACM. https://doi.org/10.1145/3293883.3301496
  • Yang, H. T., Ju, J. H., Wong, Y. T., Shmulevich, I. ve Chiang, J. H. (2017). Literature-based discovery of new candidates for drug repurposing. Briefings in Bioinformatics, 18(3), 488-497. https://doi.org/10.1093/bib/bbw030
  • Yang, L., Ji, D. ve Leong, M. (2007). Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval. Information Processing & Management, 43(2), 315-326. https://doi.org/10.1016/j.ipm.2006.07.011
  • Yılmaz, E., Verma, M., Craswell, N., Radlinski, F. ve Bailey, P. (2014). Relevance and effort: An analysis of document utility. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management içinde (s. 91-100). ACM. https://doi.org/10.1145/2661829.2661953
  • Zarrinkalam, F. ve Kahani, M. (2012). A new metric for measuring relatedness of scientific papers based on non-textual features. Intelligent Information Management, 4(4), 99-107. https://www.scirp.org/pdf/IIM20120400001_98298896.pdf
  • Zhou, H. K., Yu, H. M. ve Hu, R. (2017). Topic discovery and evolution in scientific literature based on content and citations. Frontiers of Information Technology & Electronic Engineering, 18(10), 1511-1524. https://doi.org/10.1631/FITEE.1601125
  • Zou, L., Liu, X., Buntine, W. ve Liu, Y. (2021). Citation context-based topic models: discovering cited and citing topics from full text. Library Hi Tech, 39(4), 1063-1083. https://doi.org/10.1108/LHT-01-2021-0041

Details

Primary Language: Turkish
Subjects: Information and Document Management
Section: Research Articles
Authors

Müge AKBULUT (Corresponding Author)
Ankara Yıldırım Beyazıt Üniversitesi
0000-0003-0026-6485
Türkiye


Yaşar TONTA
Hacettepe Üniversitesi
0000-0003-0285-1338
Türkiye

Acknowledgements: We thank the iSearch Team (Peter Ingwersen, Birger Larsen, Haakon Lund, and Marianne Lykke) for their help with the iSearch corpus, and Prof. Dr. Umut Al and Prof. Dr. Fazlı Can for reading an earlier version of this study and offering valuable suggestions.
Publication Date: June 30, 2022
Submission Date: January 25, 2022
Acceptance Date: April 10, 2022
Published in Issue: Year 2022, Volume 36, Issue 2

Cite

APA: Akbulut, M. & Tonta, Y. (2022). İlgi Sıralamalarının Artırımlı Olarak Geliştirilmesi: Pennant Erişimle Desteklenen Yeni Bir Yöntem Önerisi. Türk Kütüphaneciliği, 36(2), 169-203. DOI: 10.24146/tk.1062751