Research Article

Modelling of Factors Influencing the Citation Counts in Statistics

Year 2022, Volume: 10 Issue: 3, 157 - 167, 30.09.2022
https://doi.org/10.21541/apjess.1075099

Abstract

Citation count is widely regarded as the most popular quality assessment metric for scientific papers, so it is important to determine which factors raise the citation count of a paper relative to others in the same field. The main aim of this study is to model the citation counts of research published in SCI or SCI-Expanded journals in the field of Statistics, against the background of the growing number of scientific works in Turkey. It is well known that the right-skewed nature of counts makes classical regression modelling inappropriate, even when a log transformation of the counts is applied [1]. Because the distribution of citation counts contains a large number of zeros, an additional aim of this study is to model the counts with advanced discrete regression models for more precise prediction [2]. The data consist of the citation counts of all scientific papers produced by 261 statisticians between 2005 and 2017. Discrete models ranging from Poisson to Zero-Inflated and Hurdle models were constructed using potentially influential factors such as publication age, number of references, and journal category. The predictive performance of the alternative discrete models was compared via AIC and the Vuong test [3]. The results suggest that the Zero-Inflated Negative Binomial and Hurdle Negative Binomial mixture models are the best forms to predict the zero inflation of citation counts [4]. In addition, the influential factors in the final model are interpreted to offer statisticians some suggestions for increasing the citation counts of their work.
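The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: for brevity it swaps the Negative Binomial variants for an intercept-only Poisson model and an intercept-only zero-inflated Poisson (ZIP) model fitted by a simple EM loop, and the `counts` sample is hypothetical, not the study's data. All function names are illustrative assumptions.

```python
import math

def poisson_logpmf(y, lam):
    # log P(Y = y) for a Poisson distribution with mean lam
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def zip_logpmf(y, pi, lam):
    # Zero-inflated Poisson: a structural zero with probability pi,
    # otherwise an ordinary Poisson(lam) draw
    if y == 0:
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    return math.log(1.0 - pi) + poisson_logpmf(y, lam)

def fit_zip(counts, iterations=200):
    # EM for an intercept-only ZIP model (no covariates, for brevity)
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, lam = 0.5, total / n
    for _ in range(iterations):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        structural = n_zero * w
        # M-step: update the mixing weight and the Poisson mean
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam

def aic(loglik, n_params):
    # Akaike information criterion: lower is better
    return 2 * n_params - 2 * loglik

def vuong_statistic(ll_a, ll_b):
    # Vuong statistic from per-observation log-likelihoods;
    # positive values favour model A over model B
    diffs = [a - b for a, b in zip(ll_a, ll_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return math.sqrt(n) * mean / math.sqrt(var)

# Hypothetical citation counts with many zeros (not the study's data)
counts = [0] * 12 + [1, 2, 2, 3, 3, 4, 5, 6]

lam_pois = sum(counts) / len(counts)   # Poisson MLE is the sample mean
pi_hat, lam_hat = fit_zip(counts)

ll_pois = [poisson_logpmf(y, lam_pois) for y in counts]
ll_zip = [zip_logpmf(y, pi_hat, lam_hat) for y in counts]

aic_pois = aic(sum(ll_pois), n_params=1)
aic_zip = aic(sum(ll_zip), n_params=2)
vuong = vuong_statistic(ll_zip, ll_pois)

print(f"AIC Poisson = {aic_pois:.2f}, AIC ZIP = {aic_zip:.2f}")
print(f"Vuong statistic (ZIP vs Poisson) = {vuong:.2f}")
```

On a sample this zero-heavy, the ZIP model should give the lower AIC and a positive Vuong statistic. A study such as this one would add covariates (publication age, number of references, journal category) and Negative Binomial components, typically by using an established statistics package.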

Supporting Institution

Sinop University

Project Number

FEF-1901-16-07

References

  • R. O’Hara and J. Kotze, “Do not log-transform count data”, Methods in Ecology and Evolution, vol. 1, pp. 118-122, 2010.
  • L. Zha, D. Lord and Y. Zou, “The Poisson inverse Gaussian (PIG) generalized linear regression model for analyzing motor vehicle crash data”, Journal of Transportation Safety & Security, vol. 8, no. 1, pp. 18-35, 2016.
  • F. Didegah and M. Thelwall, “Which factors help authors produce the highest impact research? Collaboration, journal and document properties”, Journal of Informetrics, vol. 7, no. 4, pp. 861-873, 2013.
  • W.J. Low, P. Wilson and M. Thelwall, “Stopped sum models and proposed variants for citation data”, Scientometrics, vol. 107, pp. 369-384, 2016.
  • G. Di Vaio, D. Waldenström and J. Weisdorf, “Citation success: Evidence from economic history journal publications”, Explorations in Economic History, vol. 49, no. 1, pp. 92-104, 2012.
  • J. Mingers, F. Macri and D. Petrovici, “Using the h-index to measure the quality of journals in the field of Business and Management”, Information Processing & Management, vol. 48, no. 2, pp. 234-241, 2012.
  • C. Lokker, K.A. McKibbon, R.J. McKinlay, N.L. Wilczynski and R.B. Haynes, “Prediction of Citation Counts for Clinical Articles at Two Years Using Data Available Within Three Weeks of Publication: Retrospective Cohort Study”, BMJ, vol. 336, no. 7645, pp. 655-657, 2008.
  • C. Chen, “Predictive effects of structural variation on citation counts”, Journal of the Association for Information Science and Technology, vol. 63, no. 3, pp. 431-449, 2012.
  • Z. Hu and Y. Wu, “Regularity in the time-dependent distribution of the percentage of never-cited papers: An empirical pilot study based on the six journals”, Journal of Informetrics, vol. 8, no. 1, pp. 136-146, 2014.
  • M. Thelwall and P. Wilson, “Regression for Citation Data: An Evaluation of Different Methods”, Journal of Informetrics, vol. 8, no. 4, pp. 963-971, 2014.
  • D. Maliniak, R. Powers and B.F. Walter, “The gender citation gap in international relations”, International Organization, vol. 67, no. 4, pp. 889-922, 2013.
  • L.J. Zigerell, “Is The Gender Citation Gap in International Relations Driven by Elite Papers?”, Research & Politics, April-June, pp. 1-7, 2015.
  • J.B. Santos and F.J.O. Irizo, “Modelling Citation Age Data with Right Censoring”, Scientometrics, vol. 62, no. 3, pp. 329-342, 2005.
  • Y. Qian, W. Rong, N. Jiang, J. Tang and Z. Xiong, “Citation regression analysis of computer science publications in different ranking categories and subfields”, Scientometrics, vol. 108, pp. 1-24, 2017.
  • P. Ahlgren, C. Colliander and P. Sjogarde, “Exploring the relation between referencing practices and citation impact: A large-scale study based on Web of Science data”, Journal of the Association for Information Science and Technology, vol. 69, no. 5, pp. 728-743, 2018.
  • J. Hardin and J. Hilbe, “Generalized linear models and extensions”, College Station, Texas, USA: Stata Corporation, 2012.
  • E. Arıcan, “Nitel yanıt değişkene sahip regresyon modellerinde tahmin yöntemleri” [Estimation methods in regression models with a qualitative response variable], Master’s thesis, Cukurova University, Institute of Science, 2010. https://tez.yok.gov.tr/UlusalTezMerkezi/tezSorguSonucYeni.jsp.
There are 17 citations in total.

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Research Articles
Authors

Olcay Alpay 0000-0003-1446-0801

Nazan Danacıoğlu 0000-0001-8014-6920

Emel Çankaya 0000-0002-2892-2520

Project Number FEF-1901-16-07
Publication Date September 30, 2022
Submission Date February 18, 2022
Published in Issue Year 2022

Cite

IEEE O. Alpay, N. Danacıoğlu, and E. Çankaya, “Modelling of Factors Influencing the Citation Counts in Statistics”, APJESS, vol. 10, no. 3, pp. 157–167, 2022, doi: 10.21541/apjess.1075099.

Academic Platform Journal of Engineering and Smart Systems