I am looking for a solution to collapse claims into a single row by claim number and in need of assistance.
Have:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | . | . | 0 | 1 | 0 | 0 |
1 | . | 01/01/2014 | 0 | 0 | 1 | $200 |
1 | 01/01/2014 | . | 1 | 0 | 0 | 0 |
2 | . | . | 0 | 1 | 0 | 0 |
2 | . | 01/06/2014 | 0 | 0 | 1 | $100 |
2 | 01/04/2014 | . | 1 | 0 | 0 | 0 |
3 | . | . | 0 | 1 | 0 | 0 |
3 | . | . | 0 | 0 | 0 | 0 |
3 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
4 | . | 0 | 0 | 0 | 0 | |
4 | . | 0 | 0 | 0 | 0 | |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
Want:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | 01/01/2014 | 01/01/2014 | 1 | 1 | 1 | $200 |
2 | 01/04/2014 | 01/06/2014 | 1 | 1 | 1 | $100 |
3 | 01/10/2014 | . | 1 | 1 | 0 | 0 |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
You can do this in several ways e.g. retain - arrays etc. If there is fixed 3 rows then maybe (not really though too much about the joins here, just typing quickly as leaving):
proc sql;
create table WANT as
select COALESCE(A.CLAIM_NUM,B.CLAIM_NUM,C.CLAIM_NUM) as CLAIM_NUM,
A.ER_ENC_DATE,
B.CT_SCAN_DATE,
A.ER_FLAG,
C.HEAD_INJ_FLAG,
...
from (select * from HAVE where ER_ENC_DATE is not null) A
full join (select * from HAVE where CT_SCAN_DATE is not null) B
on A.CLAIM_NUM=B.CLAIM_NUM
full join (select * from HAVE where HEAD_INJ_FLAG=1) C
on COALESCE(A.CLAIM_NUM,B.CLAIM_NUM)=C.CLAIM_NUM;
quit;
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
Thank you for the simple and effective solution!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.