Need help aggregating data for Kappa statistic

03-18-2014 04:17 PM

Hello all, I have the following data format. Each subject could be rated in either a Y or R category.

Subject | Rater | Categorization |
---|---|---|
1 | 1 | Y |
1 | 2 | Y |
1 | 3 | R |
2 | 1 | R |
2 | 2 | R |
3 | 1 | Y |
3 | 2 | Y |
4 | 1 | Y |
4 | 2 | |
4 | 3 | R |

I would like to aggregate the data to the following count format:

Subject | Cat_Y | Cat_R |
---|---|---|
1 | 2 | 1 |
2 | 0 | 2 |
3 | 2 | 0 |
4 | 1 | 1 |

I am not sure whether the best way to do this would be via a data step or some proc (e.g. means, freq, etc.). In total I have ratings for about 60 subjects, so I would like to automate the process as much as possible. Any suggestions?

Thanks!

Accepted Solutions

Solution

Posted in reply to spirto

03-18-2014 04:33 PM

I, too, think that sql would be the easiest. However, if you prefer a datastep and your data are already sorted by subject, then you could use:

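    data want (keep=subject cat_:);
      set have;
      by subject;
      /* reset both counters at the start of each subject */
      if first.subject then do;
        cat_y=0;
        cat_r=0;
      end;
      /* sum statements keep their running totals across rows */
      if categorization eq 'R' then cat_r+1;
      else if categorization eq 'Y' then cat_y+1;
      /* write one row per subject */
      if last.subject then output;
    run;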
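The data step above relies on BY-group processing, so it expects the input to be ordered by subject. If that is not already the case, a minimal sketch of the sorting step could look like the following. It assumes the input data set is named have, as in the code above, and that sorting it in place is acceptable.

```sas
/* Sort in place by subject so that "by subject;" in the data step works. */
/* Assumption: overwriting the row order of HAVE is acceptable;           */
/* add out=<some other name> to keep the original order untouched.        */
proc sort data=have;
  by subject;
run;
```

The PROC SQL approach does not need this step, since group by does its own grouping.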

All Replies

Posted in reply to spirto

03-18-2014 04:29 PM

    proc sql;
      create table want as
      select subject,
             sum(Categorization='R') as cat_r,
             sum(Categorization='Y') as cat_y
      from have
      group by subject;
    quit;
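The query above works because SAS evaluates a comparison such as Categorization='R' to 1 when true and 0 when false, so summing it counts the matching rows. A row with a missing rating, like rater 2 for subject 4, adds 0 to both columns. For readers who prefer that spelled out, here is an equivalent sketch with explicit case expressions. The table name want_case is only an illustrative choice, and the columns are listed in the Cat_Y, Cat_R order of the requested output.

```sas
proc sql;
  /* want_case is a hypothetical name used only for this sketch */
  create table want_case as
  select subject,
         /* count Y ratings per subject */
         sum(case when Categorization = 'Y' then 1 else 0 end) as cat_y,
         /* count R ratings per subject */
         sum(case when Categorization = 'R' then 1 else 0 end) as cat_r
  from have
  group by subject;
quit;
```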

Posted in reply to art297

03-18-2014 04:45 PM

Thank you Art and Hai.kuo.