I am looking for a solution to collapse claims into a single row by claim number and in need of assistance.
Have:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | . | . | 0 | 1 | 0 | 0 |
1 | . | 01/01/2014 | 0 | 0 | 1 | $200 |
1 | 01/01/2014 | . | 1 | 0 | 0 | 0 |
2 | . | . | 0 | 1 | 0 | 0 |
2 | . | 01/06/2014 | 0 | 0 | 1 | $100 |
2 | 01/04/2014 | . | 1 | 0 | 0 | 0 |
3 | . | . | 0 | 1 | 0 | 0 |
3 | . | . | 0 | 0 | 0 | 0 |
3 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
4 | . | 0 | 0 | 0 | 0 | |
4 | . | 0 | 0 | 0 | 0 | |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
Want:
CLAIM_NUM | ER_ENC_DATE | CT_SCAN_DATE | ER_FLAG | HEAD_INJ_FLAG | CT_SCAN_FLAG | CT_SCAN_COST |
1 | 01/01/2014 | 01/01/2014 | 1 | 1 | 1 | $200 |
2 | 01/04/2014 | 01/06/2014 | 1 | 1 | 1 | $100 |
3 | 01/10/2014 | . | 1 | 1 | 0 | 0 |
4 | 01/10/2014 | . | 1 | 0 | 0 | 0 |
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
You can do this in several ways e.g. retain - arrays etc. If there is fixed 3 rows then maybe (not really though too much about the joins here, just typing quickly as leaving):
proc sql;
create table WANT as
select COALESCE(A.CLAIM_NUM,B.CLAIM_NUM,C.CLAIM_NUM) as CLAIM_NUM,
A.ER_ENC_DATE,
B.CT_SCAN_DATE,
A.ER_FLAG,
C.HEAD_INJ_FLAG,
...
from (select * from HAVE where ER_ENC_DATE is not null) A
full join (select * from HAVE where CT_SCAN_DATE is not null) B
on A.CLAIM_NUM=B.CLAIM_NUM
full join (select * from HAVE where HEAD_INJ_FLAG=1) C
on COALESCE(A.CLAIM_NUM,B.CLAIM_NUM)=C.CLAIM_NUM;
quit;
If all your variables are numeric, this is easy. Assuming your data is already sorted by CLAIM_NUM:
proc summary data=have;
by claim_num;
var list of numeric variables;
output out=want (drop=_type_ _freq_) max=;
run;
If some of your variables are character, you will have to leave them out of the VAR statement in PROC SUMMARY. Instead, bring them in using a second step:
data want;
update want have (keep=claim_num list of character variables);
by claim_num;
run;
That should do it. Good luck.
Thank you for the simple and effective solution!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.