A Study of Chance-Corrected Agreement Coefficients for the Measurement of Multi-Rater Consistency

Xie, Zheng (ORCID: 0000-0001-8649-6235), Gadepalli, Chaitanya and Cheetham, Barry (2018) A Study of Chance-Corrected Agreement Coefficients for the Measurement of Multi-Rater Consistency. International Journal of Simulation: Systems, Science & Technology, 19 (2). 10.1-10.9. ISSN 1473-8031

PDF (Version of Record), Published Version, 226kB. Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: http://ijssst.info/Vol-19/No-2/cover-19-2.htm

Abstract

Chance-corrected agreement coefficients such as the Cohen and Fleiss Kappas are commonly used to measure consistency in the decisions made by clinical observers or raters. However, the way in which they estimate the probability of agreement (Pe) or the cost of disagreement (De) 'by chance' has been strongly questioned, and alternatives have been proposed, such as the Aickin Alpha coefficient and the Gwet AC1 and AC2 coefficients. A well-known paradox illustrates deficiencies of the Kappa coefficients which may be remedied by scaling Pe or De according to the uniformity of the scoring. The AC1 and AC2 coefficients result from applying this scaling to the Brennan-Prediger coefficient, which may be considered a simplified form of Kappa. This paper examines some commonly used multi-rater agreement coefficients, including AC1 and AC2. It then proposes an alternative subject-by-subject scaling approach that may be applied to weighted and unweighted multi-rater Cohen and Fleiss Kappas, and also to Intra-Class Correlation (ICC) coefficients.
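To make the abstract's contrast concrete, the sketch below (not taken from the paper, and assuming the standard two-rater formulas) computes the observed agreement Po and three chance-agreement estimates Pe: Cohen's Kappa derives Pe from the product of the raters' marginal proportions, Brennan-Prediger assumes a uniform Pe = 1/q over q categories, and Gwet's AC1 scales chance agreement according to the uniformity of the scoring. A skewed contingency table exhibits the well-known paradox: raw agreement is 90%, yet Cohen's Kappa is low. The paper's proposed subject-by-subject scaling for multi-rater Kappas and ICC is not reproduced here.

import numpy as np

def cohen_kappa(table):
    """Cohen's Kappa: Pe from the product of the two raters' marginals."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    po = np.trace(t) / n                               # observed agreement Po
    pe = (t.sum(axis=1) / n) @ (t.sum(axis=0) / n)     # chance agreement Pe
    return (po - pe) / (1.0 - pe)

def brennan_prediger(table):
    """Brennan-Prediger: same (Po - Pe)/(1 - Pe) form, but Pe = 1/q."""
    t = np.asarray(table, dtype=float)
    po = np.trace(t) / t.sum()
    pe = 1.0 / t.shape[0]                              # uniform chance over q categories
    return (po - pe) / (1.0 - pe)

def gwet_ac1(table):
    """Gwet's AC1: Pe = sum_k pi_k (1 - pi_k) / (q - 1),
    where pi_k is the mean marginal proportion for category k."""
    t = np.asarray(table, dtype=float)
    n, q = t.sum(), t.shape[0]
    po = np.trace(t) / n
    pi = (t.sum(axis=1) + t.sum(axis=0)) / (2.0 * n)   # mean marginal proportions
    pe = np.sum(pi * (1.0 - pi)) / (q - 1)
    return (po - pe) / (1.0 - pe)

# Skewed scores trigger the paradox: Po = 0.90, yet Cohen's Pe = 0.82.
skewed = [[85, 5],
          [5, 5]]
print(cohen_kappa(skewed))       # ~0.44 despite 90% raw agreement
print(brennan_prediger(skewed))  # 0.80
print(gwet_ac1(skewed))          # ~0.88

Note the scaling behaviour the abstract describes: when the scoring is uniform (pi_k = 1/q for all k), AC1's Pe equals the Brennan-Prediger value 1/q, and as the marginals become more skewed Pe shrinks, so AC1 is not depressed by skewed prevalence the way Cohen's Kappa is.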

