I've got data on one judge who rated 100 MRIs 5 times each. The rating options are "M", "L", "M and L", or "Neither", and there is no natural ordering to these categories. Is there a way to get a measure of intra-rater reliability for these data?
Look up the topic "Tests and Measures of Agreement" in the PROC FREQ documentation. Cohen's kappa is available. - PG
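For example, something along these lines should give kappa for the agreement between two of the five readings. The data set name (ratings) and the variable names (read1, read2) are placeholders for however the repeated readings are stored, one MRI per row; the AGREE option needs a square table, so both readings must use the same set of rating levels.

proc freq data=ratings;
   tables read1*read2 / agree;   /* AGREE requests simple and weighted kappa */
   test kappa;                   /* asymptotic test of H0: kappa = 0 */
run;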
My understanding of Cohen's kappa is that it is a test of inter-rater reliability - the agreement between two raters. I haven't had any luck finding documentation of anyone using it for intra-rater reliability. The investigator is asking how internally consistent this one rater is, across 5 repetitions of this 4-level nominal rating system.
Weighted kappa applies to more than two raters (or to independent ratings made at different times by the same rater). SAS even computes exact and stratified versions, with confidence intervals and a test of kappa homogeneity across strata. Unless the investigator can be more specific about what is meant by reliability, weighted kappa could well serve as a measure of consistency. Using strata, you could track a rater's progress over time or compare the difficulty of different types of MRIs.
PG
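As a rough sketch of the stratified version, assuming a classification variable mri_type and the same read1/read2 layout as above: PROC FREQ then reports a kappa within each stratum, an overall kappa, and a test that the kappas are equal across strata. Note that the weighted kappa it also prints depends on the ordering of the rating levels, which these nominal categories don't really have.

proc freq data=ratings;
   tables mri_type*read1*read2 / agree;   /* stratum kappas, overall kappa, test of equal kappas */
   exact kappa;                           /* exact p-value for the simple kappa */
run;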
If you treat this as a problem of testing agreement among 5 raters, you can use the multiple rater kappa provided by the MAGREE macro:
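A rough sketch of how that might look, assuming the five readings are stored as character variables read1-read5 with one row per MRI in a data set called ratings. The macro wants the data in long form (one row per MRI per reading), and the parameter names in the %magree call below are my reading of the macro's documentation, so check them against the current version on support.sas.com before running; the macro file also has to be compiled (e.g. %INCLUDEd) first.

data long;
   set ratings;                  /* assumed layout: one row per MRI, readings in read1-read5 */
   array reads{5} read1-read5;
   do reading = 1 to 5;          /* treat each of the 5 readings as a separate "rater" */
      rating = reads{reading};
      output;
   end;
   keep mri_id reading rating;   /* mri_id is an assumed subject identifier */
run;

/* %inc "magree.sas";  compile the macro downloaded from support.sas.com first */
/* parameter names below are my assumption - confirm against the MAGREE documentation */
%magree(data=long, items=mri_id, raters=reading, response=rating, stat=kappa)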