- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am fairly new to sas and prefer to work on proc SQL with sas but i have some issues that i am unable to solve with proc SQL as i came to know that sas doesn't support pivot functions of SQL. I have a query and any help from the community is appreciated. I have looked around a lot but can't seem to get my head around this problem. Here's a sample input data:
person_id code_1 code_2 code_3 identifier
111 167 162 183 7803
111 167 162 183 7803
111 167 162 183 7803
111 167 162 183 7934
111 167 162 183 7934
111 167 162 181 15272
111 167 162 181 15272
112 169 157 157 87926
112 167 167 136 88042
112 169 167 136 60183
113 1661 167 136 878618
113 1661 167 136 881027
So what the above data has 5 columns (person id, code_1, code_2, code_3 and an identifier). Now i have been trying to pivot the above data to find the counts of the code_1, code_2, code_3 and also keep in mind the unique identifier as otherwise the counts may be overestimated. I was thinking with SQL' pivot function but can't do it in sas. So I need help to do in sas. Also the output i need is something like this:
person_id | code_167 | code_162 | code_183 | code_169 | code_157 | code_181 | code_136 | code_1661|
111 | 3 | 3 | 2 | 0 | 0 | 1 | 0 | 0 |
112 | 3 | 0 | 0 | 2 | 2 | 0 | 2 | 0 |
113 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 2 |
So the above is the output table needed. How it works is, it counts the instances of the occurrences of the code numbers per unique identifier for a member. For example in first three rows we have the same unique identifier for the member and hence the code is counted only once and put in the column. Basically we need to count values only for unique value in the identifier column. Also sometimes for a unique identifier same code value may occur in code_1 column and code_2 column but if the identifier is unique for the individual it will be counted as 2 only. Like for member 112 where 157 occurred twice but since identifier was unique it was counted as two.
Please let me know if you need any further info because i am completely stuck here.
Edit: Just as an additional information,the result has to be used as input for further processes.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
take a look at these examples
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
2. Calculate counts using SQL
3. Then transpose to a wide format.
proc transpose data=have out=long;
by person_ID identifier;
var code_1-code_3;
run;
proc sql;
create table summary as
select person_id, identifier, count(distinct col1) as code_counts
from long
group by person_id, identifier;
quit;
Then transpose again.
Untested code, but hopefully gives you the idea.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Reeza wrote:
1. First transpose it to a fully long format.
2. Calculate counts using SQL
3. Then transpose to a wide format.
If the output is not to be used as input to another procedure but for people to read then likely either Proc Report or Tabulate can display in a wide format and likely do the counts as well.