07-11-2012 03:48 PM
I've got data on one judge who rated 100 MRIs 5 times each. The options for the rating are "M", "L", "M and L", or "Neither", and there is no natural ordering to these ratings. Is there a way to get an intra-rater reliability measure for these data?
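For concreteness, here is a minimal sketch of how the data might be laid out in SAS, one row per MRI per reading. The dataset name (ratings), the variable names (mri, rep, rating), and the codes ML and N for "M and L" and "Neither" are all invented for illustration:

data ratings;
   /* One row per MRI per repetition: 100 MRIs x 5 repetitions = 500 rows. */
   /* Only the first two MRIs are shown, and the values are made up.       */
   input mri rep rating $;
   datalines;
1 1 M
1 2 M
1 3 ML
1 4 M
1 5 ML
2 1 N
2 2 N
2 3 L
2 4 N
2 5 N
;
run;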
07-12-2012 09:21 AM
My understanding of Cohen's Kappa is that it is a measure of inter-rater reliability, the agreement between two raters. I'm not having any luck finding documentation of anyone using it for intra-rater reliability. The investigator is asking how internally consistent this one rater is, using 5 repetitions of this 4-level nominal rating system.
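For a single pair of repetitions, Cohen's kappa is easy to get from PROC FREQ by treating the two readings as two raters. A minimal sketch, assuming a hypothetical dataset pairs with one row per MRI and variables rating1 and rating2 holding two of the five readings:

proc freq data=pairs;
   tables rating1*rating2 / agree;  /* AGREE requests kappa, with its standard */
                                    /* error and confidence limits, for this   */
                                    /* square two-way table                    */
   test kappa;                      /* asymptotic test of H0: kappa = 0        */
run;

Note that PROC FREQ computes kappa only when the table is square, that is, when all four categories appear in both readings.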
07-12-2012 11:05 AM
Kappa statistics extend to more than two raters, or to independent ratings made at different times by the same rater. SAS even computes exact and stratified versions, with confidence intervals and a test that kappa is homogeneous across strata. Unless the investigator can be more specific as to what is meant by reliability, kappa could well serve as a measure of consistency; since your categories have no natural ordering, the simple (unweighted) kappa is the appropriate form, as weighted kappa presupposes an ordinal scale. Using strata, you could evaluate the progress of a rater or compare the level of difficulty of different types of MRIs.
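A sketch of those PROC FREQ facilities, again with hypothetical names: pairs holds one row per MRI with two readings, rating1 and rating2, and mritype is a stratification variable such as the type of MRI:

proc freq data=pairs;
   tables mritype*rating1*rating2 / agree;  /* kappa within each stratum, an   */
                                            /* overall kappa across strata,    */
                                            /* and a test of equal kappas      */
   exact kappa;                             /* exact p-values for simple kappa */
run;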
08-30-2012 01:14 PM
If you treat this as a problem of testing agreement among 5 raters, you can use the multiple-rater (Fleiss) kappa provided by the MAGREE macro:
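A minimal sketch of the call, assuming the long-form layout shown earlier (one row per MRI per repetition) and that the macro, which is available from the SAS web site, has been saved locally. The file path is a placeholder, and the parameter names follow the macro's documentation:

%inc "magree.sas";         /* the downloaded MAGREE macro definition      */

%magree(data=ratings,      /* one observation per MRI per repetition      */
        items=mri,         /* the 100 MRIs being rated                    */
        raters=rep,        /* the 5 repetitions, treated here as 5 raters */
        response=rating);  /* the 4-level nominal rating                  */

For a nominal response like this one, the macro computes a kappa for each rating category along with an overall multiple-rater kappa.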