How to use adjusted degree of distinguishability and inter-rater reliability simultaneously?
Year 2021, Volume 42, Issue 3, 743–750, 24.09.2021
When the categories of a square contingency table are ordinal, the weighted kappa or Gwet's AC2 coefficient is used to summarize the degree of reliability between two raters. In addition, when investigating reliability among raters, category distinguishability should also be considered. This study aims to assess inter-rater reliability and category distinguishability together for ordinal rating scales. The weighted kappa, AC2, and adjusted degree of distinguishability coefficients are applied to pathology data, and the results are discussed for each pair of pathologists.
[2] Yilmaz, A.E., Saracbasi, T., Assessing Agreement between Raters from the Point of Coefficients and Log-linear Models, Journal of Data Science, 14 (1) (2017) 1–24.
[4] Cohen, J., Weighted Kappa: Nominal Scale Agreement with Provision for Scaled Disagreement or Partial Credit, Psychological Bulletin, 70 (4) (1968) 213–220.
[5] Gwet, K.L., Handbook of Inter-rater Reliability: The Definitive Guide to Measuring the Extent of Agreement among Raters, 3rd ed., Maryland: Advanced Analytics, LLC, (2002).
[6] Cicchetti, D., Allison, T., A New Procedure for Assessing Reliability of Scoring EEG Sleep Recordings, American Journal of EEG Technology, 11 (3) (1971) 101–109.
[7] Fleiss, J.L., Cohen, J., The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability, Educational and Psychological Measurement, 33 (3) (1973) 613–619.
[8] Landis, J.R., Koch, G.G., The Measurement of Observer Agreement for Categorical Data, Biometrics, 33 (1) (1977a) 159–174.
[9] Darroch, J.N., McCloud, P.I., Category Distinguishability and Observer Agreement, Australian Journal of Statistics, 28 (3) (1986) 371–388.
[10] Perkins, S.M., Becker, M.P., Assessing Rater Agreement using Marginal Association Models, Statistics in Medicine, 21 (12) (2002) 1743–1760.
[11] Yilmaz, A.E., Saracbasi, T., Agreement and Adjusted Degree of Distinguishability for Square Contingency Tables, Hacettepe Journal of Mathematics and Statistics, 48 (2) (2019) 592–604.
[12] Holmquist, N.D., McMahon, C.A., Williams, O.D., Variability in Classification of Carcinoma in Situ of the Uterine Cervix, Archives of Pathology, 84 (4) (1967) 334–345.
[13] Landis, J.R., Koch, G.G., An Application of Hierarchical Kappa-type Statistics in the Assessment of Majority Agreement among Multiple Observers, Biometrics, 33 (2) (1977b) 363–374.
[14] Becker, M.P., Agresti, A., Log-linear Modelling of Pairwise Interobserver Agreement on a Categorical Scale, Statistics in Medicine, 11 (1) (1992) 101–114.
[15] Agresti, A., Categorical Data Analysis. New York: John Wiley and Sons, (2002).
[16] Light, R.J., Measures of Response Agreement for Qualitative Data: Some Generalizations and Alternatives, Psychological Bulletin, 76 (5) (1971) 365–377.
Yılmaz, A. E. (2021). How to use adjusted degree of distinguishability and inter-rater reliability simultaneously? Cumhuriyet Science Journal, 42(3), 743–750. https://doi.org/10.17776/csj.898192