I am looking for a solution to collapse claims into a single row by claim number and in need of assistance.
Have:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | . | . | 0 | 1 | 0 | 0 |
1 | . | 01/01/2014 | 0 | 0 | 1 | $200 |
1 | 01/01/2014 | . | 1 | 0 | 0 | 0 |
2 | . | . | 0 | 1 | 0 | 0 |
2 | . | 01/06/2014 | 0 | 0 | 1 | $100 |
2 | 01/04/2014 | . | 1 | 0 | 0 | 0 |
3 | . | . | 0 | 1 | 0 | 0 |
3 | . | . | 0 | 0 | 0 | 0 |
3 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
4 | . | 0 | 0 | 0 | 0 | |
4 | . | 0 | 0 | 0 | 0 | |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
Want:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | 01/01/2014 | 01/01/2014 | 1 | 1 | 1 | $200 |
2 | 01/04/2014 | 01/06/2014 | 1 | 1 | 1 | $100 |
3 | 01/10/2014 | . | 1 | 1 | 0 | 0 |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
You can do this in several ways e.g. retain - arrays etc. If there is fixed 3 rows then maybe (not really though too much about the joins here, just typing quickly as leaving):
proc sql;
create table WANT as
select COALESCE(A.CLAIM_NUM,B.CLAIM_NUM,C.CLAIM_NUM) as CLAIM_NUM,
A.ER_ENC_DATE,
B.CT_SCAN_DATE,
A.ER_FLAG,
C.HEAD_INJ_FLAG,
...
from (select * from HAVE where ER_ENC_DATE is not null) A
full join (select * from HAVE where CT_SCAN_DATE is not null) B
on A.CLAIM_NUM=B.CLAIM_NUM
full join (select * from HAVE where HEAD_INJ_FLAG=1) C
on COALESCE(A.CLAIM_NUM,B.CLAIM_NUM)=C.CLAIM_NUM;
quit;
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
Thank you for the simple and effective solution!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.