Solved
Contributor
Posts: 39

# Need help aggregating data for Kappa statistic

Hello all, I have the following data format. Each subject could be rated in either a Y or R category.

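| Subject | Rater | Categorization |
|---|---|---|
| 1 | 1 | Y |
| 1 | 2 | Y |
| 1 | 3 | R |
| 2 | 1 | R |
| 2 | 2 | R |
| 3 | 1 | Y |
| 3 | 2 | Y |
| 4 | 1 | Y |
| 4 | 2 | |
| 4 | 3 | R |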

I would like to aggregate the data to the following count format:

| Subject | Cat_Y | Cat_R |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 0 | 2 |
| 3 | 2 | 0 |
| 4 | 1 | 1 |

I am not sure whether the best way to do this is a data step or some proc (e.g., MEANS, FREQ, etc.). In total I have ratings for about 60 subjects, so I would like to automate the process as much as possible. Any suggestions?

Thanks!

Accepted Solutions
Solution
‎03-18-2014 04:33 PM
PROC Star
Posts: 8,169

## Re: Need help aggregating data for Kappa statistic

I, too, think that SQL would be the easiest. However, if you prefer a data step and your data are already sorted by subject, then you could use:

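    data want (keep=subject cat_:);  /* cat_: keeps every variable whose name starts with cat_ */
        set have;
        by subject;                  /* requires have to be sorted by subject */
        if first.subject then do;    /* reset both counters at the start of each subject */
            cat_y=0;
            cat_r=0;
        end;
        if categorization eq 'R' then cat_r+1;       /* sum statement: value is retained across rows */
        else if categorization eq 'Y' then cat_y+1;
        if last.subject then output; /* write one row per subject */
    run;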
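
To try the step above on the sample data from the question, something like the following sketch could be used. The variable name `rater` is an assumption (the thread only confirms `subject` and `categorization`), and `truncover` is used here so that the row with no categorization is read as missing instead of pulling a value from the next line. The `proc sort` is only needed if the data are not already in subject order.

```sas
/* Sample data from the question; rater is an assumed variable name. */
data have;
    infile datalines truncover;   /* short rows leave categorization missing */
    input subject rater categorization $;
    datalines;
1 1 Y
1 2 Y
1 3 R
2 1 R
2 2 R
3 1 Y
3 2 Y
4 1 Y
4 2
4 3 R
;
run;

/* BY-group processing expects the input sorted by the BY variable. */
proc sort data=have;
    by subject;
run;
```

Because the missing rating for subject 4 matches neither 'R' nor 'Y', it is counted in neither column, which agrees with the counts requested in the question.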

All Replies
Posts: 3,167

## Re: Need help aggregating data for Kappa statistic

    proc sql;
        create table want as
        select subject,
               sum(Categorization='R') as cat_r,
               sum(Categorization='Y') as cat_y
        from have
        group by subject;
    quit;
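
The query above works because SAS evaluates a comparison such as `Categorization='R'` to 1 when it is true and 0 when it is false, so summing it counts the matching rows. If a more explicit form is preferred, the same counts could be written with a `CASE` expression. This is a sketch of an equivalent query, not part of the original reply:

```sas
proc sql;
    create table want as
    select subject,
           /* add 1 for each matching row, 0 otherwise */
           sum(case when Categorization = 'R' then 1 else 0 end) as cat_r,
           sum(case when Categorization = 'Y' then 1 else 0 end) as cat_y
    from have
    group by subject;
quit;
```

Neither form needs the data to be sorted beforehand, since `group by` handles the grouping.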

Contributor
Posts: 39

## Re: Need help aggregating data for Kappa statistic

Thank you Art and Hai.kuo.

🔒 This topic is solved and locked.