Hello all, I have the following data format. Each subject could be rated in either a Y or R category.
Subject | Rater | Categorization |
---|---|---|
1 | 1 | Y |
1 | 2 | Y |
1 | 3 | R |
2 | 1 | R |
2 | 2 | R |
3 | 1 | Y |
3 | 2 | Y |
4 | 1 | Y |
4 | 2 | |
4 | 3 | R |
I would like to aggregate the data to the following count format:
Subject | Cat_Y | Cat_R |
---|---|---|
1 | 2 | 1 |
2 | 0 | 2 |
3 | 2 | 0 |
4 | 1 | 1 |
I am not sure whether the best way to do this would be a data step or some proc (e.g., means, freq, etc.). In total I have ratings for about 60 subjects, so I would like to automate the process as much as possible. Any suggestions?
Thanks!
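For anyone who wants to run the replies against the sample table above, here is a minimal sketch of a step that recreates it. The dataset name `have` is taken from the replies in this thread; the use of a period for the blank rating of subject 4, rater 2 is an assumption about how the missing value is stored.

```sas
/* Hypothetical setup step: recreates the sample table from the question as HAVE. */
/* With list input, a lone period is read as a missing (blank) character value.   */
data have;
  input Subject Rater Categorization $;
  datalines;
1 1 Y
1 2 Y
1 3 R
2 1 R
2 2 R
3 1 Y
3 2 Y
4 1 Y
4 2 .
4 3 R
;
run;
```

The rows are already in subject order, which matters for any approach that uses a BY statement.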
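/* A comparison such as Categorization='R' evaluates to 1 (true) or 0 (false), */
/* so summing it within each subject counts the matching rows.                 */
proc sql;
  create table want as
  select subject, sum(Categorization='R') as cat_r,
         sum(Categorization='Y') as cat_y
  from have
  group by subject;
quit;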
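I, too, think that SQL would be the easiest. However, if you prefer a data step and your data are already sorted by subject, then you could use:
data want (keep=subject cat_:);
  set have;
  by subject;
  /* reset both counters at the first row of each subject */
  if first.subject then do;
    cat_y=0;
    cat_r=0;
  end;
  /* sum statements keep their values across rows; blank ratings match neither test */
  if categorization eq 'R' then cat_r+1;
  else if categorization eq 'Y' then cat_y+1;
  /* write one row per subject once its last rating has been counted */
  if last.subject then output;
run;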
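Since the question also asked about procs, here is a rough sketch of a PROC FREQ plus PROC TRANSPOSE route. It assumes the same `have` dataset and variable names used above; the names `counts` and `want_freq` are made up for illustration, and the WHERE statement is my own choice for dropping blank ratings.

```sas
/* Count every Subject x Categorization cell; SPARSE keeps zero-count cells */
/* in the OUT= data set, so a subject with no R ratings still gets an R row. */
proc freq data=have noprint;
  where not missing(categorization);   /* skip unrated rows */
  tables subject*categorization / out=counts sparse;
run;

/* Turn the R and Y rows into Cat_R and Cat_Y columns, one row per subject. */
proc transpose data=counts out=want_freq(keep=subject Cat_:) prefix=Cat_;
  by subject;
  id categorization;
  var count;
run;
```

One caveat with this sketch: a subject whose ratings are all blank would be filtered out by the WHERE statement and would not appear in the result at all, whereas the SQL and data step versions would keep that subject with zero counts.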
Thank you Art and Hai.kuo.