06-27-2016 10:30 AM
I have two data sets where multiple raters have rated multiple x-rays into 3 categories. All raters have rated all x-rays and there is no missing data.
I've calculated an overall kappa for each of the two data sets using the %MAGREE macro. Is anyone aware of a way to statistically compare these two kappas? The macro produces a standard error for the overall kappa, which is very small (presumably because there are >200 raters in each group). Would it be proper to use this SE to create confidence intervals? The macro doesn't produce confidence intervals itself, which makes me worry that doing so is incorrect. Are there other ways to create these confidence intervals? I've also seen one example of bootstrapping to create CIs in this situation, and I'm not sure which approach is most appropriate.
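To make the question concrete, here is a minimal sketch of the two calculations being asked about: a normal-approximation (Wald) interval built from a kappa and its standard error, and a z-test for the difference between two kappas. It is written in Python rather than SAS purely for illustration; the function names `wald_ci` and `compare_kappas` and all numeric inputs are made up and are not output from the macro. It rests on two assumptions that may not hold here: that the two kappas come from independent samples, and that the reported SE is suitable for interval estimation (an SE computed under the null hypothesis kappa = 0 would not be).

```python
import math


def wald_ci(kappa, se, z_crit=1.959964):
    """Normal-approximation confidence interval: kappa +/- z_crit * se.

    The default z_crit corresponds to a two-sided 95% interval.
    """
    half_width = z_crit * se
    return kappa - half_width, kappa + half_width


def compare_kappas(kappa1, se1, kappa2, se2):
    """z-test for the difference between two kappas.

    Assumes the two estimates are independent, so the variance of the
    difference is the sum of the two squared standard errors.
    Returns the z statistic and the two-sided p-value.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (kappa1 - kappa2) / se_diff
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    k1, se1 = 0.62, 0.010
    k2, se2 = 0.58, 0.012
    print("95% CI, data set 1:", wald_ci(k1, se1))
    print("95% CI, data set 2:", wald_ci(k2, se2))
    print("z, p for difference:", compare_kappas(k1, se1, k2, se2))
```

If the two data sets share x-rays or raters, the independence assumption fails and the summed-variance formula no longer applies; resampling the x-rays in a bootstrap is one way around that.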
06-27-2016 10:47 AM
You mention a macro for computing the kappas. Is it the %MAGREE macro that is provided by SAS? Can you provide a link or reference for the bootstrapping example that you are considering?
06-27-2016 10:52 AM
Yes - the %magree macro from SAS is what I used.
This is what I've read about using bootstrapping to generate CIs for kappas: