BookmarkSubscribeRSS Feed
jbje
Calcite | Level 5

Hello,


I am hoping someone can help me simplify my code. 
In short, I am trying to summarize costs per patient per event type, depending on when the event happened (specifically, within 90 days after randomization, within 1 year, within 2 years etc... The sums should therefore be cumulative.) 

 

Each event has a cost and also a corresponding variable specifying how many days after time 0 (randomization) the event occured. 

 

My dataset looks something like this (simplified):

ID   dis_c_hf_hosp1   dis_c_hf_hosp2   dis_c_hf_hosp3  days_rand2hf_hosp1   days_rand2hf_hosp2   days_rand2hf_hosp3  

1     8000                              7500                     7000                             60                                 180                                 730

2     8500                              8000                     7000                             30                                 180                                 730

3     8000                              7500                     7000                             60                                 180                                 730

4     9000                              8500                     7000                             10                                 180                                 730

5     8000                              7500                     7000                             60                                 180                                 730

 

I am repeating this step for many different clinical events and would therefore like a short, generalizable code. Currently, I am working in a very long macro that goes up to max 11 events per patient (max observed in dataset). It works, but there seems there should be an easier way to do this!

 

%macro time_costs (event, min, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11);

data time_costs_&event; set all_costs; 
/*costs within 90 days, excl. randomized implantation costs*/
c90_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 90 then do c90_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 90 then do c90_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 90 then do c90_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs within 1 year, excl. randomized implantation costs (also includes the first 90 days!!)*/
c365_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 365 then do c365_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 365 then do c365_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 365 then do c365_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs between year 1-2*/
c730_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 730 then do c730_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 730 then do c730_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 730 then do c730_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs between year 2-3*/
/*costs between year 3-4*/
/*costs between year 4-5*/
/*costs between year 5-6*/
/*formatting costs*/
format c90_&event c365_&event c730_&event 20.;
run;

%mend time_costs;

%time_costs (hf_hosp, 1,2, 3, 4, 5, 6, 7, 8, 9, 10, 11);

Thank you!

 

- SAS rookie

 

 

 


Working in SAS 9.4. 

8 REPLIES 8
Reeza
Super User

If you can post a smaller fully worked example I think using MultiLabel Formats and PROC MEANS will do it for you. 

Please post some sample input data - 2 to 3 variables are fine - and the expected output that aligns with your input data. Also some clarifications around variable names may help....I can probably make an educated guess, but it's easier if you clarify it. 

 


@jbje wrote:

Hello,


I am hoping someone can help me simplify my code. 
In short, I am trying to summarize costs per patient per event type, depending on when the event happened (specifically, within 90 days after randomization, within 1 year, within 2 years etc... The sums should therefore be cumulative.) 

 

Each event has a cost and also a corresponding variable specifying how many days after time 0 (randomization) the event occured. 

 

My dataset looks something like this (simplified):

ID   dis_c_hf_hosp1   dis_c_hf_hosp2   dis_c_hf_hosp3  days_rand2hf_hosp1   days_rand2hf_hosp2   days_rand2hf_hosp3  

1     8000                              7500                     7000                             60                                 180                                 730

2     8500                              8000                     7000                             30                                 180                                 730

3     8000                              7500                     7000                             60                                 180                                 730

4     9000                              8500                     7000                             10                                 180                                 730

5     8000                              7500                     7000                             60                                 180                                 730

 

I am repeating this step for many different clinical events and would therefore like a short, generalizable code. Currently, I am working in a very long macro that goes up to max 11 events per patient (max observed in dataset). It works, but there seems there should be an easier way to do this!

 

%macro time_costs (event, min, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11);

data time_costs_&event; set all_costs; 
/*costs within 90 days, excl. randomized implantation costs*/
c90_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 90 then do c90_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 90 then do c90_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 90 then do c90_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 90 then do c90_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs within 1 year, excl. randomized implantation costs (also includes the first 90 days!!)*/
c365_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 365 then do c365_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 365 then do c365_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 365 then do c365_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 365 then do c365_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs between year 1-2*/
c730_&event=0;
if days_rand2&event&min ne . and  days_rand2&event&min <= 730 then do c730_&event=dis_c_&event&min; end;
if days_rand2&event&_2 ne . and  days_rand2&event&_2  <= 730 then do c730_&event=sum(dis_c_&event&min,dis_c_&event&_2); end;
if days_rand2&event&_3 ne . and  days_rand2&event&_3 <= 730 then do c730_&event=sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3); end;
if days_rand2&event&_4 ne . and  days_rand2&event&_4  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4); end;
if days_rand2&event&_5 ne . and  days_rand2&event&_5  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5); end;
if days_rand2&event&_6 ne . and  days_rand2&event&_6  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6); end;
if days_rand2&event&_7 ne . and  days_rand2&event&_7  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7 ); end;
if days_rand2&event&_8 ne . and  days_rand2&event&_8  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8); end;
if days_rand2&event&_9 ne . and  days_rand2&event&_9  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9 ); end;
if days_rand2&event&_10 ne . and  days_rand2&event&_10  <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10 ); end;
if days_rand2&event&_11 ne . and  days_rand2&event&_11 <= 730 then do c730_&event=
	sum(dis_c_&event&min, dis_c_&event&_2, dis_c_&event&_3, dis_c_&event&_4,  dis_c_&event&_5, dis_c_&event&_6,dis_c_&event&_7, 
	dis_c_&event&_8,dis_c_&event&_9, dis_c_&event&_10, dis_c_&event&_11); end;
/*costs between year 2-3*/
/*costs between year 3-4*/
/*costs between year 4-5*/
/*costs between year 5-6*/
/*formatting costs*/
format c90_&event c365_&event c730_&event 20.;
run;

%mend time_costs;

%time_costs (hf_hosp, 1,2, 3, 4, 5, 6, 7, 8, 9, 10, 11);

Thank you!

 

- SAS rookie

 

 

 


Working in SAS 9.4. 


 

jbje
Calcite | Level 5

Hi Reeza,


Thanks for you reply! I'll try to be more specific: 

 

Variables:

dis_c_hf_hosp1 = discounted cost of 1st heart failure (hf) hospitalization

dis_c_hf_hosp2 = discounted cost of 2nd heart failure (hf) hospitalization 

(etc.)

days_rand2hf_hosp1 = days from randomization to 1st heart failure (hf) hospitalization

days_rand2hf_hosp12 = days from randomization to 2nd heart failure (hf) hospitalization

(etc.)

c_90d_hf_hosp = cumulative costs within 90 days due to heart failure hospitalizations

c_1year_hf_hosp=cumulative costs within 1 year due to heart failure  hospitalization

(etc)

 

Variables I have: 

ID   dis_c_hf_hosp1    dis_c_hf_hosp2   days_rand2hf_hosp1    days_rand2hf_hosp2

1           8000                     7500                         30                                    180

2           7500                     7000                         180                                   400

(etc)

 

Additional variables I want:

ID ...    c_90d_hf_hosp     c_1year_hf_hosp

1                  8000                             15500 

2                     0                                  7500

(etc)

 

Thank you!

 

Astounding
PROC Star

Another educated guess:  You have transformed an easy problem into a horribly complex problem, by the way you are storing your data.  You would find this problem much easier if you stored a separate observation for each event.  So the first observation would become 3 observations:

 

ID=1  Event=1  Disc_hf_hosp = 8000  Days_randhf_hosp = 60

ID=1  Event=2  Disc_hf_hosp = 7500  Days_randhf_hosp = 180

ID=1  Event=3  Disc_hf_hosp = 7000  Days_randhf_hosp = 730

 

I'm not positive about this particular case, because I'm not 100% sure where you are headed with this as your intended outcome.  But it's very likely you can make your life easy by storing the data differently.

ballardw
Super User

@Astounding wrote:

Another educated guess:  You have transformed an easy problem into a horribly complex problem, by the way you are storing your data.  You would find this problem much easier if you stored a separate observation for each event.  So the first observation would become 3 observations:

 

ID=1  Event=1  Disc_hf_hosp = 8000  Days_randhf_hosp = 60

ID=1  Event=2  Disc_hf_hosp = 7500  Days_randhf_hosp = 180

ID=1  Event=3  Disc_hf_hosp = 7000  Days_randhf_hosp = 730

 

I'm not positive about this particular case, because I'm not 100% sure where you are headed with this as your intended outcome.  But it's very likely you can make your life easy by storing the data differently.


I was also guessing that there may be more than one hospital  and that there should be a Hosp= as well.

jbje
Calcite | Level 5

Hi Astounding, 

I agree, the question should be really simple to answer and I probably am complicating things by the way I have programmed the dataset up until now.  I have also considered if changing the format to a long format  would help, but haven't been able to figure out a solution yet.

 

The outcomes I am interested in are the sum of medical costs within 90 days, 1 year, 2 years etc. Heart failure hospitalization is just one of approx. 40 event types, that would have to be summarized in the end. In fact, the distribution of costs by event type is not necessary, so maybe in the long format I could just summarize the cost of ALL events happening within 90 days, 365 days etc. 

 

I will have a go at it. Thanks for your input!

 

ballardw
Super User

@jbje wrote:

Hi Astounding, 

I agree, the question should be really simple to answer and I probably am complicating things by the way I have programmed the dataset up until now.  I have also considered if changing the format to a long format  would help, but haven't been able to figure out a solution yet.

 

The outcomes I am interested in are the sum of medical costs within 90 days, 1 year, 2 years etc. Heart failure hospitalization is just one of approx. 40 event types, that would have to be summarized in the end. In fact, the distribution of costs by event type is not necessary, so maybe in the long format I could just summarize the cost of ALL events happening within 90 days, 365 days etc. 

 

I will have a go at it. Thanks for your input!

 


It would help to show what the desired results from your example data might be.

 

Does this look at all helpful?

data have;
input ID dis_c_hf_hosp1 dis_c_hf_hosp2 dis_c_hf_hosp3 
         days_rand2hf_hosp1 days_rand2hf_hosp2 days_rand2hf_hosp3 
;
datalines;
1 8000 7500 7000 60 180 730
2 8500 8000 7000 30 180 730
3 8000 7500 7000 60 180 730
4 9000 8500 7000 10 180 730
5 8000 7500 7000 60 180 730
;
run;
data trans;
   set have;
   array dis  dis_c_hf_hosp1 dis_c_hf_hosp2 dis_c_hf_hosp3;
   array day  days_rand2hf_hosp1 days_rand2hf_hosp2 days_rand2hf_hosp3;
   do event= 1 to dim(dis);
      type = 'HF';
      dis_c= dis[event];
      days = day[event];
      cumcost = sum(cumcost,dis_c);
      if dis_c then output;
   end;
   keep id event type dis_c days cumcost;
run;

Without actually seeing your variable names it isn't possible to guess what other "type" of treatment or what ever this is. You mentioned something about heart failure so I assume that is what the HF stands for.

 

The above turns each occurrence of a dis_c_hf into an output observation and calculates a cumulative total for the dis_c.

 

I am not sure exactly what summaries you want so can't go much further.

jbje
Calcite | Level 5

Hi ballardw,

 

Thank you, this was absolutely helpful. Sorry I have not been explicit enough - I am new to this community. I'll try to explain to next steps better:

 

Heart failure (hf_hosp) is only one out of about 40 events that can occur during follow-up, and each event can occur up to 11 times at different timepoints. I am interested in the cumulative cost of each event within specific time periods (90 days, 1 year - 8 years) and subsequently, I want summarize the events' cumulative costs to a total cumulative cost (per patient) within 90 days etc. 

 

The coding of the other events is identical to the coding of heart failure. Here's an abbreviated dummy dataset:

 

 

/*dummy dataset*/
data have;
input ID dis_c_hf_hosp1 dis_c_hf_hosp2 dis_c_hf_hosp3 /*discounted costs for heart failure hospitalization events, no 1-3*/
         days_rand2hf_hosp1 days_rand2hf_hosp2 days_rand2hf_hosp3 /*days from randomization to heart failure hospitalizations*/
         dis_c_endocarditis1 dis_c_endocarditis2 dis_c_endocarditis3 /*discounted costs for endocarditis events, no 1-3*/
         days_rand2endocarditis1 days_rand2endocarditis2 days_rand2endocarditis3 /*days from randomization to endocarditis events*/
	 dis_c_mi1 dis_c_mi2 dis_c_mi3 /*discounted costs for myocardial infarction events, no 1-3*/
         days_rand2mi1 days_rand2mi2 days_rand2mi3 /*days from randomization to myocardial infarction 
	/*etc*/
;
datalines;
1 8000 7500 7000 60 180 730 28000 27500 27000 50 170 720 18000 17500 17000 70 190 740
2 8500 8000 7000 30 180 730 28000 27500 27000 50 170 720 18000 17500 17000 70 190 740
3 8000 7500 7000 60 180 730 28000 27500 27000 50 170 720 18000 17500 17000 70 190 740
4 9000 8500 7000 10 180 730 28000 27500 27000 50 170 720 18000 17500 17000 70 190 740
5 8000 7500 7000 60 180 730 28000 27500 27000 50 170 720 18000 17500 17000 70 190 740
;
run;

 

With your approach and using a macro, I am able to create a long dataset per event type, but I am not sure how to get from there to getting the cumulative costs per event type and summarize them all. 

 

Variables wanted:

c90_hf_hosp (cumulative costs of heart failure within 90 days)

c1y_hf_hosp

c2y_hf_hosp

etc...

c90_endocarditis (cumulative costs of endocarditis within 90 days)

c1y_endocarditis

c2y_endocarditis

etc...

c90_mi  (cumulative costs of myocardial infarction within 90 days)

c1y_mi

c2y_mi

etc...

c90_all_events (total cumulative costs within 90 days)

c1y_all_events

c2y_all_events

etc...

 

Hope this makes sense and thank you again!

Reeza's approach is somewhat similar, but I find the multilabel formats difficult to work with for the subsequent summation, but maybe that's just me!

 

Reeza
Super User

Here's an example of how this could be done. I first restructure the data into a long format using an array to do it in one step. 

Then I create the custom format for the 90 and one year intervals, then you're good to go.

 

*type in sample data;

data have;
    infile cards dlm=',';
    input ID dis_c_hf_hosp1 dis_c_hf_hosp2 days_rand2hf_hosp1 days_rand2hf_hosp2;
    cards;
1, 8000, 7500, 30, 180
2, 7500, 7000, 180, 400
;
    ;
    ;;
run;

*transpose to a long format;

data flipped;
    set have;
    array dis(*) dis_:;
    array days(*) days_:;

    do Hospital=1 to dim(dis);
        Discounted_Cost=dis(hospital);
        Days_from_Rand=days(hospital);
        output;
    end;
    drop dis_: days_r:;
run;

*create format for periods of interest;

proc format;
    value duration_fmt (multilabel) low - 90='Within 90 days' 
        low - 365='Within one year' other='CHECKME';
    *I add this to catch things that don't fit for checking;
run;

*summarize totals;

proc means data=flipped n sum nway;
    class ID days_from_rand / mlf;
    *note mlf used for formatting;
    format days_from_rand duration_fmt.;
    var Discounted_Cost;
    ods output summary=want;
    *want is name of output data set;
run;

Your data, in either postings don't really explain the event issue but hopefully this gets you started. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1365 views
  • 2 likes
  • 4 in conversation