Clopper-Pearson Confidence Interval for Average Data

Regular Contributor
Posts: 221

I have three 2x2 contingency tables, where each table displays the agreement counts between 2 readers on a dichotomous test. I have been asked to average the agreement rates, which I can easily do. But then I was asked to put a confidence interval around the averaged value. The average agreement would simply be (0.76+0.67+0.80)/3 = 0.7433. Does anyone know how I could average the confidence intervals? Can I simply average the lower bounds and the upper bounds to get a new confidence interval? Can I average the cell counts from the 3 tables and make a new table from which I can generate the Clopper-Pearson CI? (I have attempted this, but my agreement rate didn't come out quite the same, so I abandoned that whole plan.)

Test 1
                     Reader 2
Reader 1      Positive   Negative
Positive            23         11
Negative             1         14

Agreement Rate = 76%

95% CI = (0.6113, 0.8666)

Test 2
                     Reader 2
Reader 1      Positive   Negative
Positive            21         13
Negative             3         12

Agreement Rate = 67%

95% CI = (0.5246, 0.8005)

Test 3
                     Reader 2
Reader 1      Positive   Negative
Positive            19          5
Negative             5         20

Agreement Rate = 80%

95% CI = (0.6566, 0.8976)
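
For reference, the per-table agreement rates and exact intervals can be recomputed directly from the cell counts. The following is a minimal sketch, not the original poster's code: the dataset and variable names are illustrative, and the Clopper-Pearson limits are taken from beta quantiles, which assumes 0 < x < n. Its output can be compared against the intervals quoted above. Note that the unrounded rates are 37/49, 33/49 and 39/49, so their average is 109/147, or about 0.7415, rather than the 0.7433 obtained from the rounded percentages.

```sas
/* Cell counts from the three tables above (names are illustrative):
   a = both positive, b = row reader positive only,
   c = column reader positive only, d = both negative. */
data agreement;
  input test a b c d;
  n     = a + b + c + d;          /* subjects in the table */
  x     = a + d;                  /* concordant calls */
  rate  = x / n;                  /* observed agreement */
  alpha = 0.05;
  /* Clopper-Pearson limits via beta quantiles (valid when 0 < x < n) */
  lower = quantile('BETA', alpha/2,     x,     n - x + 1);
  upper = quantile('BETA', 1 - alpha/2, x + 1, n - x);
  datalines;
1 23 11 1 14
2 21 13 3 12
3 19 5 5 20
;

proc print data=agreement noobs;
  var test x n rate lower upper;
  format rate lower upper 6.4;
run;
```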

Respected Advisor
Posts: 2,655

Re: Clopper-Pearson Confidence Interval for Average Data

How about using kappa as the measure of agreement? PROC FREQ will then give CIs for both the individual tables and the overall value.

data one;
input test reader1 $ reader2 $ weight;
datalines;
1 P P 23
1 P N 11
1 N P 1
1 N N 14
2 P P 21
2 P N 13
2 N P 3
2 N N 12
3 P P 19
3 P N 5
3 N P 5
3 N N 20
;

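proc freq data=one;
  /* One reader1*reader2 table per test; AGREE requests kappa
     with confidence limits, plus an overall kappa across tests */
  tables test*reader1*reader2 / agree;
  weight weight;   /* cell counts are supplied in WEIGHT */
  test agree;      /* asymptotic test for kappa */
run;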

I know that kappa is NOT the same as the agreement parameter you calculated, but it is widely used as a measure of rater (test) agreement, and has better statistical properties.

Steve Denham
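
To make the difference between kappa and the raw agreement rate concrete, kappa can also be computed by hand from the same cell counts. This is only a sketch of the standard Cohen's kappa formula, (po - pe) / (1 - pe); the dataset and variable names are illustrative, and PROC FREQ remains the place to get the confidence limits.

```sas
/* a = both positive, b = row reader positive only,
   c = column reader positive only, d = both negative. */
data kappa_by_hand;
  input test a b c d;
  n  = a + b + c + d;
  po = (a + d) / n;                                 /* observed agreement */
  /* chance agreement from the row and column margins */
  pe = ((a + b)*(a + c) + (c + d)*(b + d)) / n**2;
  kappa = (po - pe) / (1 - pe);                     /* agreement beyond chance */
  datalines;
1 23 11 1 14
2 21 13 3 12
3 19 5 5 20
;

proc print data=kappa_by_hand noobs;
  var test po pe kappa;
  format po pe kappa 6.4;
run;
```

The observed agreement po is the rate reported in the first post; kappa rescales it by the agreement expected from the margins alone, which is why the two numbers differ.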

Regular Contributor
Posts: 221

Re: Clopper-Pearson Confidence Interval for Average Data

I have done kappa statistics (as well as PABAK), but my CEO is more into observed agreement rates since they are more easily understandable to our clinicians than kappa statistics. We do report both, but we are still interested in averaging the values. I didn't specify (which I should have done), but these are pairwise agreement rates. There were only 3 readers, and each of them read all 49 subjects. Test 1 is Reader 1 vs. Reader 2; Test 2 is Reader 1 vs. Reader 3; and Test 3 is Reader 2 vs. Reader 3. Can you suggest a better methodology than averaging them? We have done a 2-reader agreement and a 3-reader agreement (the number of times in which 2 readers and 3 readers make the same call, respectively). We have also done a method where we randomly selected 2 of the 3 readers for each subject, and then ran our agreement statistics based on that outcome.
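
Because all three tables describe the same 49 subjects, the three pairwise rates are correlated, so averaging their confidence limits is hard to justify. One possible alternative, offered as a sketch rather than an endorsed method: with three readers and a dichotomous call, each subject is either unanimous (all three pairs agree) or split 2-1 (exactly one pair agrees). The average pairwise agreement is therefore 1/3 + (2/3) * (proportion of unanimous subjects), and an exact interval for that single binomial proportion can be mapped through the same linear transformation. The count of unanimous subjects follows from the posted tables: the concordant pairs total 37 + 33 + 39 = 109 = 3u + (49 - u), so u = 30. The variable names below are illustrative, and the interval treats the three readers as fixed and the subjects as independent.

```sas
data avg_agreement;
  n     = 49;                 /* subjects read by all three readers */
  conc  = 37 + 33 + 39;       /* concordant pairs summed over the three tables */
  u     = (conc - n) / 2;     /* unanimous subjects, from conc = 3u + (n - u) */
  alpha = 0.05;
  p_u   = u / n;              /* proportion of unanimous subjects */
  /* Clopper-Pearson limits for p_u (valid when 0 < u < n) */
  lo_u  = quantile('BETA', alpha/2,     u,     n - u + 1);
  hi_u  = quantile('BETA', 1 - alpha/2, u + 1, n - u);
  /* map to the average pairwise agreement scale */
  avg    = 1/3 + (2/3) * p_u;
  avg_lo = 1/3 + (2/3) * lo_u;
  avg_hi = 1/3 + (2/3) * hi_u;
run;

proc print data=avg_agreement noobs;
  var u n p_u avg avg_lo avg_hi;
  format p_u avg avg_lo avg_hi 6.4;
run;
```

Here avg equals 109/147, the same value as the mean of the three unrounded pairwise rates. The interval reflects sampling of subjects only; it says nothing about how a different set of readers would perform.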

Respected Advisor
Posts: 2,655

Re: Clopper-Pearson Confidence Interval for Average Data

I think there is a technical term for this, but it boils down to !&%$**!##.  That just sucks about "more easily understandable".

Three way agreement is tough.  Maybe a generalized linear model where reader is a repeated factor on each of the 49 subjects, and then calculating an ICC, but I am definitely starting to feel outside my comfort zone on this.

Steve Denham

Regular Contributor
Posts: 221

Re: Clopper-Pearson Confidence Interval for Average Data

I have been using Fleiss' kappa along with ICC (they appear to be nearly identical) when dealing with more than 2 readers.  I think what it comes down to is this: we are trying to decide what the probability is that any 2 randomly selected readers (from a pool of trained readers) will agree on a patient's status--positive or negative.  This can be used to explain to the clinicians, but it will also help us in planning a study that will validate our diagnostic for the FDA.

Respected Advisor
Posts: 2,655

Re: Clopper-Pearson Confidence Interval for Average Data

That is an interesting problem.  Reader as a random effect...

How interpretable would the following be?

proc glimmix;
  class reader subjid;
  model response=/dist=binomial;
  random intercept reader/subject=subjid;
run;

and then calculating the ICC from the variance components.  Not quite it--I think you'll need variance due to each reader, so maybe

random intercept/subject=subjid group=reader;

would be better.

Steve Denham

